THE ELLIPTIC LAW 
Abstract. We show that, under some general assumptions on the entries of a random
complex $n \times n$ matrix $X_n$, the empirical spectral distribution of $\frac{1}{\sqrt{n}} X_n$ converges to the
uniform law on an ellipsoid as $n$ tends to infinity. This generalizes the well-known circular
law in random matrix theory.



1. Introduction 



Let $X_n$ be an $n \times n$ matrix with complex eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. The empirical spectral
measure $\mu_{X_n}$ of $X_n$ is defined as

$$\mu_{X_n} := \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i},$$

and the corresponding empirical spectral distribution (ESD) $F^{X_n}(x,y)$ is given by

$$F^{X_n}(x,y) := \frac{1}{n} \#\{1 \le j \le n : \operatorname{Re}(\lambda_j) \le x, \operatorname{Im}(\lambda_j) \le y\}.$$

Here $\#E$ denotes the cardinality of the set $E$. In the case when the eigenvalues of $X_n$ are
real, we write the ESD $F^{X_n}$ as just a function of $x$,

$$F^{X_n}(x) := \frac{1}{n} \#\{1 \le j \le n : \lambda_j \le x\}.$$
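To make the definition concrete, the following minimal sketch evaluates the two-dimensional ESD of a finite list of eigenvalues. The function name `esd` and the toy spectrum are illustrative assumptions, not objects from the paper.

```python
from typing import Sequence


def esd(eigenvalues: Sequence[complex], x: float, y: float) -> float:
    """Fraction of eigenvalues lambda with Re(lambda) <= x and Im(lambda) <= y."""
    n = len(eigenvalues)
    count = sum(1 for lam in eigenvalues if lam.real <= x and lam.imag <= y)
    return count / n


# Toy spectrum (hypothetical values chosen only for illustration).
eigs = [1 + 1j, -1 + 0.5j, 0.5 - 2j, -0.25 - 0.25j]

print(esd(eigs, 0.0, 0.0))  # only -0.25-0.25j lies in the lower-left quadrant
print(esd(eigs, 1.0, 1.0))  # every eigenvalue satisfies both inequalities
```

In an actual experiment the list `eigs` would hold the eigenvalues of $\frac{1}{\sqrt{n}} X_n$ computed by a numerical linear algebra routine.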

A fundamental problem in random matrix theory is to determine the limiting distribution 
of the ESD as the size of the matrix tends to infinity. In certain cases when the entries have
a special distribution, such as Gaussian, the joint distribution of the eigenvalues can be given
explicitly, and so the limiting distribution can be derived directly. However, these explicit 
formulas are not available for many random matrix ensembles, and so the problem of finding 
the limiting distribution becomes much more difficult. On the other hand, the well-known 
universality phenomenon in random matrix theory predicts that the limiting distribution 
should not depend on the distribution of the entries. We give two famous examples below. 

In the 1950s, Wigner studied the limiting ESD for a large class of random Hermitian matrices 
whose entries on or above the diagonal are independent [H]. In particular, Wigner showed 
that, under some additional moment and symmetry assumptions on the entries, the ESD 
of such a matrix converges to the semi-circular law $F_{sc}$ with density given by

$$F_{sc}'(x) := \begin{cases} \frac{1}{2\pi}\sqrt{4 - x^2}, & -2 \le x \le 2, \\ 0, & \text{otherwise.} \end{cases}$$

The most general form of the semi-circular law assumes only the first two moments of the
entries [SJ. 

Theorem 1.1 (Semi-circular law for Wigner matrices). Assume that $X_n$ is an $n \times n$ Hermitian
matrix whose entries on or above the diagonal are independent. Further assume that
the diagonal entries are i.i.d. real random variables and those above the diagonal are i.i.d.
complex random variables with variance one. Then the ESD of the matrix $\frac{1}{\sqrt{n}} X_n$ converges
almost surely to the semi-circular law as $n \to \infty$.

The ESD for non-Hermitian random matrices with i.i.d. entries was first studied by Mehta
[27]. In particular, in the case where the entries of $X_n$ are i.i.d. complex normal random
variables, Mehta showed that the ESD of $\frac{1}{\sqrt{n}} X_n$ converges, as $n \to \infty$, to the circular law
$F_{cir}$ given by

$$F_{cir}(x,y) := \frac{1}{\pi} \operatorname{mes}\left(\{z \in \mathbb{C} : |z| \le 1, \operatorname{Re}(z) \le x, \operatorname{Im}(z) \le y\}\right).$$

In other words, $F_{cir}$ is the two-dimensional distribution function for the uniform probability
measure on the unit disk in the complex plane.

Mehta used the joint density function of the eigenvalues of $\frac{1}{\sqrt{n}} X_n$, which was derived by
Ginibre [10]. The real Gaussian case was studied by Edelman in [TJ. For the general
(non-Gaussian) case when there is no formula, the problem appears much more difficult.
Important results were obtained by Girko [12], Bai [H [3], and more recently by Götze
and Tikhomirov [18], Pan and Zhou [34], and Tao and Vu [4T]. These results confirm the
same limiting law under some moment or smoothness assumptions on the distribution of
the entries. Recently, Tao and Vu (appendix by Krishnapur) were able to remove all these
additional assumptions, establishing the law under the first two moments [42].

Theorem 1.2 (Circular law for non-Hermitian i.i.d. matrices). Assume that the entries
of the $n \times n$ matrix $X_n$ are i.i.d. copies of a complex random variable with mean zero and
variance one. Then the ESD of the matrix $\frac{1}{\sqrt{n}} X_n$ converges almost surely to the circular
law as $n \to \infty$.

The two celebrated results above provide a fairly complete picture of the limiting law
for the ESD of Hermitian and non-Hermitian i.i.d. matrices. In the 1980s, Girko initiated
a study of the limiting distribution for more general matrices which interpolate between 
Hermitian and non-Hermitian models. 

Definition 1.3 (Condition C0). Let $(\xi_1, \xi_2)$ be a random vector in $\mathbb{C}^2$ where both $\xi_1$ and $\xi_2$
have mean zero and unit variance. Let $\{x_{ij}\}_{i,j \ge 1}$ be an infinite double array of random variables
on $\mathbb{C}$. For each $n \ge 1$ we define the random $n \times n$ matrix $X_n = (x_{ij})_{1 \le i,j \le n}$.
We say that the sequence of random matrices $\{X_n\}_{n \ge 1}$ satisfies condition C0 with atom variables $(\xi_1, \xi_2)$ if
the following conditions hold:

(i) (Independence) $\{x_{ii} : i \ge 1\} \cup \{(x_{ij}, x_{ji}) : 1 \le i < j\}$ is a collection of independent
random elements,

(ii) (Common distribution) each pair $(x_{ij}, x_{ji})$, $1 \le i < j$, is an i.i.d. copy of $(\xi_1, \xi_2)$,

(iii) (Flexibility of the main diagonal) the diagonal elements, $\{x_{ii} : i \ge 1\}$, are i.i.d. with
mean zero and finite variance.
It is clear that many Hermitian and non-Hermitian i.i.d. matrix ensembles belong to the
above class. In fact, it also contains linear combinations of independent Hermitian and
non-Hermitian i.i.d. matrices.



Over the past thirty years, Girko has established a number of results for the limiting law
of random matrices satisfying condition C0. We refer the reader to [13, 14, 15, 16, 17] and
the references therein. To the best of our understanding, Girko's proofs are incomplete and lack
rigor. The reader familiar with the area may also relate this to Girko's controversial works on the circular
law (see the discussions in [HE]).

When $\{X_n\}_{n \ge 1}$ is a sequence of random matrices that satisfy condition C0 with jointly
Gaussian atom variables $(\xi_1, \xi_2)$, the joint eigenvalue density can be derived explicitly and
the limiting ESD can be computed directly; see [21, 22, 23] and references therein. Recently,
Naumov [29] has been able to verify the same limiting law for a much more general class of
real random matrices whose entries have finite fourth moment.

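For $-1 < \rho < 1$, denote by $\mathcal{E}_\rho$ the ellipsoid

$$\mathcal{E}_\rho := \left\{ z \in \mathbb{C} : \frac{\operatorname{Re}(z)^2}{(1+\rho)^2} + \frac{\operatorname{Im}(z)^2}{(1-\rho)^2} \le 1 \right\}.$$

Theorem 1.4 ([29]). Let $\{X_n\}_{n \ge 1}$ be a sequence of real random matrices that satisfy condition
C0 with real atom variables $(\xi_1, \xi_2)$ where $\mathbb{E}[\xi_1 \xi_2] = \rho$, $-1 < \rho < 1$. Also, assume
that $\max(\mathbb{E}|\xi_1|^4, \mathbb{E}|\xi_2|^4) < \infty$. Then the ESD of the matrix $\frac{1}{\sqrt{n}} X_n$ converges in probability
as $n \to \infty$ to the elliptic law $F_\rho$ with parameter $\rho$ given by

$$F_\rho(x, y) := \frac{1}{\pi(1-\rho^2)} \operatorname{mes}\left(\{z \in \mathcal{E}_\rho : \operatorname{Re}(z) \le x, \operatorname{Im}(z) \le y\}\right).$$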

In conjunction with Theorems 1.1 and 1.2 and with the universality phenomenon, it is tempting
to conjecture that Theorem 1.4 should hold without any further moment assumption. One
of the main goals of this paper is to resolve this conjecture for the real case.

For any matrix $M$, we define the Hilbert-Schmidt norm $\|M\|_2$ by the formula

$$\|M\|_2 := \sqrt{\operatorname{tr}(M^* M)} = \sqrt{\operatorname{tr}(M M^*)}. \tag{1}$$
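As a quick sanity check of formula (1), the sketch below computes both traces for a small complex matrix stored as nested lists. The helper names and the sample matrix are assumptions made for illustration only.

```python
import math


def conj_transpose(M):
    """Conjugate transpose M* of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]


def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def hs_norm_left(M):
    """sqrt(tr(M* M))."""
    return math.sqrt(trace(matmul(conj_transpose(M), M)).real)


def hs_norm_right(M):
    """sqrt(tr(M M*))."""
    return math.sqrt(trace(matmul(M, conj_transpose(M))).real)


# Hypothetical 2 x 2 example; the squared moduli of the entries sum to 15.
M = [[1 + 2j, 0], [3, -1j]]
print(hs_norm_left(M), hs_norm_right(M))
```

Both expressions equal the square root of the sum of the squared moduli of the entries, which is why the two traces in (1) agree.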

Theorem 1.5 (Elliptic law for real random matrices). Let $\{X_n\}_{n \ge 1}$ be a sequence of real
random matrices that satisfy condition C0 with real atom variables $(\xi_1, \xi_2)$ where $\mathbb{E}[\xi_1 \xi_2] = \rho$,
$-1 < \rho < 1$. Assume that $\{F_n\}_{n \ge 1}$ is a sequence of deterministic matrices such that
$\operatorname{rank}(F_n) = o(n)$ and $\sup_n \frac{1}{n^2}\|F_n\|_2^2 < \infty$. Then the ESD of $\frac{1}{\sqrt{n}}(X_n + F_n)$ converges almost
surely to the elliptic law with parameter $\rho$ as $n \to \infty$.



In fact, we are able to extend Theorem 1.5 to the following more general setting. 

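Definition 1.6 ($(\mu, \rho)$-family). Given parameters $0 \le \mu \le 1$ and $-1 < \rho < 1$, we say that
the complex random variable pair $(\xi_1, \xi_2)$ belongs to the $(\mu, \rho)$-family if the following holds.

(i) Both $\xi_1$ and $\xi_2$ have mean zero and unit variance;

(ii) $\mathbb{E}[(\operatorname{Re}(\xi_1))^2] = \mathbb{E}[(\operatorname{Re}(\xi_2))^2] = \mu$ and $\mathbb{E}[(\operatorname{Im}(\xi_1))^2] = \mathbb{E}[(\operatorname{Im}(\xi_2))^2] = 1 - \mu$;

(iii) $\mathbb{E}[\operatorname{Re}(\xi_1)\operatorname{Re}(\xi_2)] = \mu\rho$ and $\mathbb{E}[\operatorname{Im}(\xi_1)\operatorname{Im}(\xi_2)] = -(1 - \mu)\rho$;

(iv) $\mathbb{E}[\operatorname{Re}(\xi_i)\operatorname{Im}(\xi_j)] = 0$ for any $i, j \in \{1, 2\}$.

Notice that if $(\xi_1, \xi_2)$ belongs to a $(\mu, \rho)$-family then $\mathbb{E}|\xi_1|^2 = \mathbb{E}|\xi_2|^2 = 1$ and $\mathbb{E}[\xi_1 \xi_2] = \rho$.
More importantly, we do not require the imaginary and real parts of $\xi_1, \xi_2$ to be independent.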
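The conditions of Definition 1.6 can be checked exactly on a small discrete example. The sketch below is one possible construction and not taken from the paper: it sets $\xi_k = \sqrt{\mu}\, a_k + \sqrt{-1}\sqrt{1-\mu}\, b_k$, where $(a_1, a_2)$ and $(b_1, b_2)$ are independent pairs of signs with correlations $\rho$ and $-\rho$ respectively. All function names are hypothetical.

```python
import itertools
import math
from fractions import Fraction


def correlated_signs(rho):
    """Law of a pair (a1, a2) of +-1 signs with E[a1] = E[a2] = 0 and E[a1 a2] = rho."""
    law = []
    for s, t in itertools.product((1, -1), repeat=2):
        p_t = (1 + rho) / 2 if t == 1 else (1 - rho) / 2
        law.append(((s, s * t), Fraction(1, 2) * p_t))
    return law


def family_pair(mu, rho):
    """Law of (xi1, xi2) with xi_k = sqrt(mu) a_k + i sqrt(1 - mu) b_k."""
    law = []
    for ((a1, a2), pa), ((b1, b2), pb) in itertools.product(
            correlated_signs(rho), correlated_signs(-rho)):
        xi1 = complex(math.sqrt(mu) * a1, math.sqrt(1 - mu) * b1)
        xi2 = complex(math.sqrt(mu) * a2, math.sqrt(1 - mu) * b2)
        law.append(((xi1, xi2), float(pa * pb)))
    return law


def expect(law, f):
    """Expectation of f(xi1, xi2) under a finite law."""
    return sum(p * f(x1, x2) for (x1, x2), p in law)


mu, rho = Fraction(3, 4), Fraction(1, 2)  # sample parameters, chosen arbitrarily
law = family_pair(mu, rho)

print(expect(law, lambda x1, x2: x1.real * x2.real))  # should equal mu * rho
print(expect(law, lambda x1, x2: x1.imag * x2.imag))  # should equal -(1 - mu) * rho
print(expect(law, lambda x1, x2: x1 * x2))            # should equal rho
```

In this construction the real and imaginary parts happen to be independent; the definition does not require that, so this is only one member of the family.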

Theorem 1.7 (Elliptic law for complex random matrices). Let $0 \le \mu \le 1$ and $-1 < \rho < 1$ be
given. Let $\{X_n\}_{n \ge 1}$ be a sequence of complex matrices such that $\{X_n\}_{n \ge 1}$ satisfies condition
C0 with atom variables $(\xi_1, \xi_2)$ from the $(\mu, \rho)$-family. Assume furthermore that $\{F_n\}_{n \ge 1}$
is a sequence of deterministic matrices such that $\operatorname{rank}(F_n) = o(n)$ and $\sup_n \frac{1}{n^2}\|F_n\|_2^2 < \infty$.
Then the ESD of $\frac{1}{\sqrt{n}}(X_n + F_n)$ converges almost surely to the elliptic law with parameter $\rho$
as $n \to \infty$.



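In light of the universality phenomenon, we conjecture that Theorem 1.7 continues to hold
when $\mathbb{E}|\xi_1|^2 = \mathbb{E}|\xi_2|^2 = 1$ and $\mathbb{E}[\xi_1 \xi_2] = \rho$, where $\rho$ is a complex number satisfying $|\rho| < 1$.
In this optimal setting, the ESD of $\frac{1}{\sqrt{n}} X_n$ is conjectured to converge to the elliptic law
associated with the rotated ellipsoid $\mathcal{E}_\rho$ given by

$$\mathcal{E}_\rho := \left\{ z \in \mathbb{C} : \frac{\left(\operatorname{Re}(z)\cos\frac{\theta}{2} + \operatorname{Im}(z)\sin\frac{\theta}{2}\right)^2}{(1+|\rho|)^2} + \frac{\left(\operatorname{Re}(z)\sin\frac{\theta}{2} - \operatorname{Im}(z)\cos\frac{\theta}{2}\right)^2}{(1-|\rho|)^2} \le 1 \right\},$$

where $\theta = \operatorname{Arg}(\rho)$.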



1.8. Overview and Outline. For an $n \times n$ matrix $M$ we let

$$\sigma_1(M) \ge \sigma_2(M) \ge \cdots \ge \sigma_n(M) \ge 0$$

denote the singular values of $M$. In particular, $\sigma_n(M)$ is the least singular value of $M$.

As in the proof of the circular law [3T], the main difficulty in proving Theorems 1.5 and
1.7 is controlling the least singular value of $X_n + F_n$. Theorem 2.1 below gives a lower
bound for the least singular value motivated by the work of Tao and Vu [H]. Because of
its importance, we prove Theorem 2.1 first; the proof is contained in Sections 2-6. Section
7 is dedicated to proving our main results, Theorems 1.5 and 1.7.



1.9. Notation. For a matrix $M$ we use the notations $r_i(M)$ and $c_j(M)$ to denote its $i$-th
row vector and its $j$-th column vector respectively; we use the notation $(M)_{ij}$ and $M_{ij}$ to
denote its $(i,j)$ entry. We let $\|M\|_2$ denote the Hilbert-Schmidt norm of $M$ (defined in (1))
and let $\|M\|$ denote the spectral norm of $M$.

Here and later, asymptotic notations such as $O$, $\Omega$, $\Theta$, $\omega$, and so forth, are used under the
assumption that $n \to \infty$. A notation such as $O_C(\cdot)$ emphasizes that the hidden constant
depends on $C$. If $a = \Omega(b)$, we write $b \ll a$ or $a \gg b$. If $a = \Omega(b)$ and $b = \Omega(a)$, we write
$a \asymp b$.

As customary, we use $\eta$ to denote random Bernoulli variables (thus $\eta$ takes values $\pm 1$ with
probability 1/2). For a given $0 \le \mu \le 1$, we use $\eta^{(\mu)}$ to denote random Bernoulli variables
of parameter $\mu$ (thus $\eta^{(\mu)}$ takes values $\pm 1$ with probability $\mu/2$ and zero with probability
$1 - \mu$).

We write a.s., a.a., and a.e. for almost surely, Lebesgue almost all, and Lebesgue almost
everywhere respectively.

We use $\sqrt{-1}$ to denote the imaginary unit and reserve $i$ as an index.

2. The least singular value problem 



One of the main ingredients to prove Theorems 1.5 and 1.7 is the following polynomial
bound on the least singular value.

Theorem 2.1 (Bound on the least singular value for perturbed random matrices). Assume
that $M_n = F_n + X_n$, where the entries of the given complex matrix $F_n$ are bounded by $n^\alpha$
in absolute value, and $X_n$ is a random matrix from Theorem 1.7 for given $0 \le \mu \le 1$ and
$-1 < \rho < 1$. Then for any $B > 0$, there exists $A > 0$ depending on $B, \alpha, \mu, \rho$ such that

$$\mathbb{P}(\sigma_n(M_n) \le n^{-A}) \le n^{-B}.$$

We emphasize that our polynomial bound here is motivated by |4H Lemma 4.1] of Tao and
Vu, which plays a fundamental role in their establishment of the circular law. We also refer
the reader to the work [36] of Rudelson and Vershynin for an almost complete treatment
of the least singular values of random non-Hermitian matrices. Recently, a similar study
for random real symmetric matrices has been carried out independently by Vershynin in
[47] and by the first author in [32]. In this paper we choose to follow [32] simply because
our goal is to prove universality for a broad range of random matrices. Nevertheless,
because the matrix $X_n$ under consideration is much more complicated than symmetric or
Hermitian ones, it is necessary to generalize and combine a series of previous results
[33], [3T] and [32]. As a result, our ideas are not fully original but constitute a
highly non-trivial generalization of existing ones. The rest of this section is devoted to
sketching the approach; more details of the proofs will be presented subsequently.

For the sake of simplicity, we will prove our result under the following condition.

Condition 1. With probability one, $|x_{ij}| \le n^{B+1}$ for all $i,j$.

In fact, because all $x_{ij}$ have bounded variance, Chebyshev's inequality gives $\mathbb{P}(|x_{ij}| > n^{B+1}) = O(n^{-2B-2})$.
Thus, by the union bound over the $n^2$ entries, we can assume that $|x_{ij}| \le n^{B+1}$ at the cost of an additional negligible term $o(n^{-B})$
in probability.

We next assume that $\sigma_n(M_n) \le n^{-A}$. Thus $M_n x = y$ for some $\|x\|_2 = 1$ and $\|y\|_2 \le n^{-A}$.
There are two cases to consider.

2.2. Case 1. $M_n$ has full rank. This is the main case to consider as most random
matrices are non-singular with very high probability.

Let $C(M_n) = (c_{ij}(M_n))$, $1 \le i,j \le n$, be the matrix of the cofactors of $M_n$. By definition,
$C(M_n) y = \det(M_n) \cdot x$, and thus we have $\|C(M_n) y\|_2 = |\det(M_n)|$.
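The identity $C(M_n) y = \det(M_n) \cdot x$ can be verified on a small example with exact arithmetic. The sketch below assumes that $C(M_n)$ is read as the transposed cofactor matrix (the adjugate), which is the arrangement for which the identity holds for every $x$; the matrix and vector are hypothetical sample data.

```python
from fractions import Fraction


def minor(M, i, j):
    """Matrix obtained from M by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))


def adjugate(M):
    """Entry (i, j) of the adjugate is the (j, i) cofactor of M."""
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)] for i in range(n)]


def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]


M = [[Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(0), Fraction(1), Fraction(4)]]
x = [Fraction(1), Fraction(-2), Fraction(3)]

y = matvec(M, x)                       # y = M x
lhs = matvec(adjugate(M), y)           # adj(M) y
rhs = [det(M) * xi for xi in x]        # det(M) x
print(lhs == rhs)
```

Since $\|x\|_2 = 1$ in the argument above, taking norms on both sides yields $\|C(M_n) y\|_2 = |\det(M_n)|$.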

By paying a factor of $n$ in probability, without loss of generality we can assume that the
first component of $C(M_n) y$ is at least $|\det(M_n)|/n^{1/2}$ in absolute value,

$$|c_{11}(M_n) y_1 + \cdots + c_{1n}(M_n) y_n| \ge |\det(M_n)|/n^{1/2}. \tag{2}$$

Since $\|y\|_2 \le n^{-A}$, it follows from the Cauchy-Schwarz inequality that

$$\sum_{j=1}^{n} |c_{1j}(M_n)|^2 \ge n^{2A-1} |\det(M_n)|^2. \tag{3}$$

For $j \ge 2$, we write

$$c_{1j}(M_n) = \sum_{i=2}^{n} m_{i1} c_{ij}(M_{n-1}),$$

where $M_{n-1}$ is the matrix obtained from $M_n$ by removing its first row and first column,
$c_{ij}(M_{n-1})$ are the corresponding cofactors of $M_{n-1}$, and $m_{ij}$ are the entries of $M_n$.

Hence, by the Cauchy-Schwarz inequality, by Condition 1 and by the bounds $|f_{ij}| \le n^\alpha$ for
the entries of $F_n$, we have

$$|c_{1j}(M_n)|^2 \le \sum_{i=2}^{n} |m_{i1}|^2 \sum_{i=2}^{n} |c_{ij}(M_{n-1})|^2 \le n^{2B+2\alpha+3} \sum_{i=2}^{n} |c_{ij}(M_{n-1})|^2. \tag{4}$$

Similarly, for $j = 1$ we write $c_{11}(M_n) = \sum_{i=2}^{n} m_{i2} c_{i2}(M_{n-1})$, and thus,

$$|c_{11}(M_n)|^2 \le n^{2B+2\alpha+3} \sum_{i=2}^{n} |c_{i2}(M_{n-1})|^2. \tag{5}$$

It follows from (3), (4) and (5) that

$$\sum_{2 \le i,j \le n} |c_{ij}(M_{n-1})|^2 \ge n^{2A-2B-2\alpha-4} |\det(M_n)|^2.$$

Hence, for proving Theorem 2.1 it suffices to justify the following result.
Theorem 2.3. For any $B > 0$, there exists $A > 0$ such that

$$\mathbb{P}\Big( \Big(\sum_{2 \le i,j \le n} |c_{ij}(M_{n-1})|^2\Big)^{1/2} \ge n^{A} |\det(M_n)| \Big) \le n^{-B}.$$

To see why the assumption $\big(\sum_{2 \le i,j \le n} |c_{ij}(M_{n-1})|^2\big)^{1/2} \ge n^{A} |\det(M_n)|$ is useful, we next
express $\det(M_n)$ as a bilinear form of its first row and column,

$$\det(M_n) = c_{11}(M_n) m_{11} + \sum_{2 \le i,j \le n} c_{ij}(M_{n-1}) m_{1i} m_{j1}.$$

In other words, with $c := \big(\sum_{2 \le i,j \le n} |c_{ij}(M_{n-1})|^2\big)^{1/2}$ (which is nonzero as $M_n$ has full rank)
and with $a_{ij} := c_{ij}(M_{n-1})/c$ we have

$$\frac{1}{c} \det(M_n) = \frac{1}{c} m_{11} c_{11}(M_n) + \sum_{2 \le i,j \le n} a_{ij} m_{1i} m_{j1}. \tag{6}$$

Intuitively, if we condition on $M_{n-1}$ and $m_{11}$, then the right-hand side of (6), as a bilinear
form of the random variables $x_{1i}, x_{i1}$, $2 \le i$, is comparable to 1 in absolute value with
probability extremely close to one. Thus the assumption $\mathbb{P}(|\det(M_n)|/c \le n^{-A}) \ge n^{-B}$ of
Theorem 2.3, with appropriately large $A$, must yield a high cancellation of the bilinear form.
Based on this intuition, our rough approach will consist of the two main steps below.

• Step 1 (Inverse step). Assume that for appropriately large $A$ we have

$$\mathbb{P}_{x_{11},\ldots,x_{1n},x_{21},\ldots,x_{n1}}\Big( \Big|\frac{c_{11}(M_n)}{c} m_{11} + \sum_{2 \le i,j \le n} a_{ij} m_{1i} m_{j1}\Big| \le n^{-A} \,\Big|\, M_{n-1} \Big) \ge n^{-B}.$$

Then there must be a strong structure among the cofactors $c_{ij}$ of $M_{n-1}$.

• Step 2 (Counting step). The probability, with respect to $M_{n-1}$, that there is a
strong structure among the $c_{ij}$ is negligible.

Before stating the steps above in greater detail, we pause to introduce the structure ap- 
pearing in our analysis. 

A set $Q \subset \mathbb{C}$ is a generalized arithmetic progression (GAP) of rank $r$ if it can be expressed
in the form

$$Q = \{g_0 + k_1 g_1 + \cdots + k_r g_r \mid k_i \in \mathbb{Z},\ K_i \le k_i \le K_i' \text{ for all } 1 \le i \le r\}$$

for some $\{g_0, \ldots, g_r\}$, $\{K_1, \ldots, K_r\}$ and $\{K_1', \ldots, K_r'\}$.

It is convenient to think of $Q$ as the image of an integer box $B := \{(k_1, \ldots, k_r) \in \mathbb{Z}^r \mid K_i \le k_i \le K_i'\}$
under the linear map

$$\Phi : (k_1, \ldots, k_r) \mapsto g_0 + k_1 g_1 + \cdots + k_r g_r.$$

The numbers $g_i$ are the generators of $Q$, the numbers $K_i'$ and $K_i$ are the dimensions of
$Q$. We say that $Q$ is proper if this map is one to one, or equivalently if $|Q| = |B|$. For
non-proper GAPs, we of course have $|Q| < |B|$. If $-K_i = K_i'$ for all $i \ge 1$ and $g_0 = 0$, we
say that $Q$ is symmetric.
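The definitions above can be illustrated with a short enumeration. The sketch below builds the integer box, maps it into $\mathbb{C}$, and compares cardinalities to test properness; the function names and the sample generators are assumptions chosen for illustration.

```python
from itertools import product


def gap(g0, gens, lows, highs):
    """Return (box, image) for Q = {g0 + sum k_i g_i : lows[i] <= k_i <= highs[i]}."""
    box = list(product(*[range(lo, hi + 1) for lo, hi in zip(lows, highs)]))
    image = {g0 + sum(k * g for k, g in zip(ks, gens)) for ks in box}
    return box, image


def is_proper(g0, gens, lows, highs):
    """Q is proper exactly when the map from the box to Q is one to one."""
    box, image = gap(g0, gens, lows, highs)
    return len(image) == len(box)


# Symmetric rank-2 GAP with generators 1 and 10*sqrt(-1): the 25 box points stay distinct.
print(is_proper(0, [1, 10j], [-2, -2], [2, 2]))

# Generators 1 and 2 collide (for example 2*1 + 0*2 = 0*1 + 1*2), so the GAP is not proper.
box, image = gap(0, [1, 2], [-2, -2], [2, 2])
print(len(box), len(image))
```

The second example shows the strict inequality $|Q| < |B|$ for a non-proper GAP: the box has 25 points but only the 13 integers from $-6$ to $6$ appear in the image.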

We refer the reader to Sections 3 and 4 for further explanation as to why GAPs are the right
objects to study here. In the sequel we state our main steps rigorously with the help of
GAPs.

Theorem 2.4 (Step 1). Let $0 < \varepsilon < 1$ be a given constant. Assume that

$$\sup_a \mathbb{P}_{x_2,\ldots,x_n,x_2',\ldots,x_n'}\Big( \Big|\sum_{2 \le i,j \le n} a_{ij}(x_i + f_i)(x_j' + f_j') - a\Big| \le n^{-A} \Big) \ge n^{-B}$$

for some sufficiently large integer $A$, where

• $a_{ij} = c_{ij}(M_{n-1})/c$,

• $f_i = f_{1i}$, $f_i' = f_{i1}$ are the entries of $F_n$, and thus fixed,

• $(x_i, x_i')$ are i.i.d. copies of $(\xi_1, \xi_2)$ of a given $(\mu, \rho)$-family with $0 \le \mu \le 1$ and
$-1 < \rho < 1$,

• any collection $x_{i_1}, \ldots, x_{i_k}, x_{j_1}', \ldots, x_{j_l}'$ of random variables are mutually independent
as long as the indices $i_1, \ldots, i_k, j_1, \ldots, j_l$ are distinct.

Then there exists a complex vector $u = (u_1, \ldots, u_{n-1})$ which satisfies the following properties.

• (orthogonality) $\|u\|_2 \asymp 1$ and either $|\langle u, r_i(M_{n-1})\rangle| \le n^{-A/2+O_{B,\varepsilon}(1)}$ for $n - O_{B,\varepsilon}(1)$
rows of $M_{n-1}$ or $|\langle u, c_i(M_{n-1})\rangle| \le n^{-A/2+O_{B,\varepsilon}(1)}$ for $n - O_{B,\varepsilon}(1)$ columns of $M_{n-1}$;

• (additive structure) there exists a generalized arithmetic progression $Q$ of rank $O_{B,\varepsilon}(1)$
and size $n^{O_{B,\varepsilon}(1)}$ that contains at least $n - 2n^{\varepsilon}$ components $u_i$;

• (controlled form) all the components $u_i$, and all the generators of the generalized
arithmetic progression are rational complex numbers of the form $\frac{p}{q} + \sqrt{-1}\frac{p'}{q'}$, where
$|p|, |q|, |p'|, |q'| \le n^{A/2+O_{B,\varepsilon}(1)}$.



In the second step of the approach, we show that the probability for M n _i having the above 
properties is negligible. 

Theorem 2.5 (Step 2). With respect to $M_{n-1}$, the probability that there exists a vector $u$
as in Theorem 2.4 is $\exp(-\Omega(n))$.
2.6. Case 2. $M_n$ does not have full rank, which is the case to consider if $\xi_1, \xi_2$ have discrete
distribution. We show that this event holds with probability at most $n^{-B}$.

First, instead of the entries $x_{ij}$ of $X_n$, consider $x_{ij}' := (1 - \epsilon^2) x_{ij} + \epsilon \xi_{ij}$, where the $\xi_{ij}$ are
independently uniform on the interval $[-1, 1]$ and $\epsilon$ is very small, say $\epsilon = n^{-1000An}$. It is clear
that the continuous matrix $M_n' = X_n' + F_n$, where $X_n'$ is formed by the $x_{ij}'$ above, has full
rank with probability one. By applying Theorem 2.1 obtained from Case 1 for the matrix
$M_n'$, with probability at least $1 - n^{-B}$ one has $\sigma_n(M_n') \ge n^{-A}$, and thus

$$|\det(M_n')| \ge n^{-An}. \tag{7}$$

Next, because $M_n' = M_n - \epsilon(\epsilon x_{ij} - \xi_{ij})_{1 \le i,j \le n}$ and as $|x_{ij}| \le n^{B+1}$, by the Brunn-Minkowski inequality
and Hadamard's bound we have

$$|\det(M_n')| \le \left( |\det(M_n)|^{1/n} + O(n^{-500A}) \right)^n,$$

where we use the fact that $A$ is chosen sufficiently large compared to $B$.

Combining with (7), we then infer that $|\det(M_n)| \ge n^{-(1+o(1))An}$, and thus $\det(M_n) \ne 0$
with probability at least $1 - n^{-B}$, concluding the treatment for this case.



The proof of Theorem 2.4 will be given in Section 5 thanks to useful tools from Sections 3
and 4. Theorem 2.5 will be concluded in Section 6.

3. Anti-concentration, a warm-up 



Recall that in the inverse step, Theorem 2.4, we assumed that

$$\sup_a \mathbb{P}_{x_2,\ldots,x_n,x_2',\ldots,x_n'}\Big( \Big|\sum_{2 \le i,j \le n} a_{ij}(x_i + f_i)(x_j' + f_j') - a\Big| \le n^{-A} \Big) \ge n^{-B}. \tag{8}$$

This can be considered as a high concentration of the bilinear form $\sum_{2 \le i,j \le n} a_{ij}(x_i + f_i)(x_j' + f_j')$
on a small ball of radius $n^{-A}$, where $x_i$ and $x_i'$ are not necessarily independent.
The main idea to extract this bit of information is to relate it to a high concentration of an
appropriate linear form. This step is postponed until Section 4. Our goal now is to focus
on linear forms.

A classical result of Erdős [9] and Littlewood-Offord [26] in the 1940s asserts that if $a_i$
are complex numbers of magnitude $|a_i| \ge 1$, then the probability that the linear form
$\sum_{i=1}^{n} a_i x_i$ concentrates on a disk of radius one is of order $O(n^{-1/2})$, where $x_i$ are i.i.d. copies
of a Bernoulli random variable. Recently, motivated by inverse theorems from additive
combinatorics, Tao and Vu studied the underlying reason as to why the concentration
probability of $\sum_{i=1}^{n} a_i x_i$ on a small ball is large. They call this the inverse Littlewood-Offord
problem. A closer look at the definition of generalized arithmetic progressions in
Section 2 reveals that if $a_i$ are very close to the elements of a GAP of rank $O(1)$ and
size $n^{O(1)}$, then the probability that $\sum_{i=1}^{n} a_i x_i$ concentrates on some small ball is of order
$n^{-O(1)}$, where $x_i$ are i.i.d. copies of a Bernoulli random variable.
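The contrast between structured and unstructured coefficients can be seen in a toy computation. The sketch below is an illustration under simplifying assumptions, not the paper's setting: it uses integer coefficients and computes the largest point mass $\sup_a \mathbb{P}(\sum_i a_i x_i = a)$ exactly, rather than a small-ball probability. The function name and the two coefficient choices are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def max_atom(coeffs):
    """sup_a P(sum a_i x_i = a) for i.i.d. x_i uniform on {-1, +1}, computed exactly."""
    law = Counter({0: Fraction(1)})
    for a in coeffs:
        nxt = Counter()
        for s, p in law.items():
            nxt[s + a] += p / 2   # x_i = +1
            nxt[s - a] += p / 2   # x_i = -1
        law = nxt
    return max(law.values())


n = 12
structured = [1] * n                    # all coefficients lie in a tiny rank-1 GAP
generic = [2 ** i for i in range(n)]    # all 2^n signed sums are distinct

print(max_atom(structured))  # central binomial mass, of order n^(-1/2)
print(max_atom(generic))     # every outcome has mass 2^(-n)
```

The structured choice attains a polynomially large concentration probability, while the coefficients without additive structure only reach an exponentially small one, which is the dichotomy the inverse theorems below make precise.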

It was shown implicitly by Tao and Vu in |38l l4"Tj HI] that these are essentially the only
examples that have high concentration probability. An explicit and somewhat optimal
version has been proved in a recent paper by the first author and Vu in [33]. Before stating
this result, we pause to introduce some terminology.

We say that a real random variable $\xi$ is anti-concentrated if there exist positive constants
$\alpha_1, \alpha_2, \alpha_3$ such that $\mathbb{P}(\alpha_1 \le |\xi - \xi'| \le \alpha_2) \ge \alpha_3$, where $\xi'$ is an i.i.d. copy of $\xi$. (Note that
the requirement of anti-concentration is somewhat weaker than having mean zero and unit
variance.) We say that a complex number $a \in \mathbb{C}$ is $\delta$-close to a set $Q \subset \mathbb{C}$ if there exists
$q \in Q$ such that $|a - q| \le \delta$.

Theorem 3.1 (Inverse Littlewood-Offord theorem for linear forms, [33]). Let $0 < \varepsilon < 1$
and $B > 0$. Let $\beta > 0$ be an arbitrary real number that may depend on $n$. Suppose that
$\sum_{i=1}^{n} |a_i|^2 = 1$, and

$$\sup_a \mathbb{P}_{\mathbf{x}}\Big( \Big|\sum_{i=1}^{n} a_i(x_i + f_i) - a\Big| \le \beta \Big) = \gamma \ge n^{-B},$$

where $\mathbf{x} = (x_1, \ldots, x_n)$, and $x_i$ are i.i.d. copies of a real random variable $\xi$ satisfying the
anti-concentration condition. Then, for any number $n'$ between $n^{\varepsilon}$ and $n$, there exists a
proper symmetric GAP $Q = \{\sum_{i=1}^{r} k_i g_i : k_i \in \mathbb{Z}, |k_i| \le L_i\}$ such that

(i) (control of rank and size) $Q$ has small rank, $r = O_{B,\varepsilon}(1)$, and small cardinality

$$|Q| \le \max\Big( O_{B,\varepsilon}\Big(\frac{\gamma^{-1}}{\sqrt{n'}}\Big), 1 \Big);$$

(ii) (control of the steps) there is a non-zero integer $p = O_{B,\varepsilon}(\sqrt{n'})$ such that all steps $g_i$
of $Q$ have the form $g_i = \beta \frac{p_i}{p}$, with $p_i \in \mathbb{Z}$ and $p_i = O_{B,\varepsilon}(\beta^{-1}\sqrt{n'})$;

(iii) (good approximation) at least $n - n'$ elements of $a_i$ are $\beta$-close to $Q$.

Here the implied constants are allowed to depend on $\alpha_1, \alpha_2$ and $\alpha_3$. The interested reader
is also invited to read [36] for a similar but milder setting of the inverse Littlewood-Offord
problem for linear forms.



To attack Theorem 2.4, the first step is to study the concentration of a more general linear
form $\sum_i a_i x_i + b_i x_i'$, where $(x_i, x_i')$ are i.i.d. copies of a pair of random complex variables $(\xi_1, \xi_2)$
from a given $(\mu, \rho)$-family. Intuitively, as $\mathbb{E}|\xi_1|^2 = \mathbb{E}|\xi_2|^2 = 1$ and $|\mathbb{E}\xi_1\xi_2| = |\rho| < 1$, the
random variables $\xi_1$ and $\xi_2$ are not totally dependent on each other. (See for instance Claim
A.2 of Appendix A for a more precise statement.) This fact may suggest a way to apply
Theorem 3.1 with respect to $x_2, \ldots, x_n$ while holding $x_2', \ldots, x_n'$ "fixed", and vice versa.
One of the main results is to justify this intuition.
Theorem 3.2 (Inverse Littlewood-Offord theorem for mixing linear forms). Let $0 \le \mu \le 1$,
$-1 < \rho < 1$ and $0 < \varepsilon < 1$, $B > 0$ be given. Let $\beta > 0$ be an arbitrary real number that
may depend on $n$. Suppose that $a_i, b_i \in \mathbb{C}$ are such that $\sum_{i=1}^{n} |a_i|^2 + \sum_{i=1}^{n} |b_i|^2 = 1$, and

$$\sup_a \mathbb{P}_{\mathbf{x},\mathbf{x}'}\Big( \Big|\sum_{i=1}^{n} (a_i x_i + b_i x_i') - a\Big| \le \beta \Big) = \gamma \ge n^{-B},$$

where

• $(x_i, x_i')$ are i.i.d. copies of $(\xi_1, \xi_2)$ from a given $(\mu, \rho)$-family,

• any collection $x_{i_1}, \ldots, x_{i_k}, x_{j_1}', \ldots, x_{j_l}'$ of random variables are mutually independent
as long as the indices $i_1, \ldots, i_k, j_1, \ldots, j_l$ are distinct.

Then there exist positive constants $\alpha, c_0, C_0$ depending on $(\xi_1, \xi_2)$ and two pairs of complex
numbers $(c_1, c_2)$ and $(c_1', c_2')$ (which may depend on $n$) such that

• $|c_1|, |c_2|, |c_1'|, |c_2'|$ are bounded from below and above by $c_0$ and $C_0$ respectively,

• $|c_1/c_2 - c_1'/c_2'| \ge \alpha$,

• for any number $n'$ between $n^{\varepsilon}$ and $n$, there exists a proper symmetric GAP $Q =
\{\sum_{i=1}^{r} k_i g_i : k_i \in \mathbb{Z}, |k_i| \le L_i\} \subset \mathbb{C}$ whose parameters satisfy (i) and (ii) of Theorem 3.1
and for at least $n - n'$ indices $i$, the pairs $c_1 a_i + c_2 b_i$, $c_1' a_i + c_2' b_i$ are $\beta$-close to $Q$.



As Theorem 3.2 can be shown by modifying the proof of Theorem 3.1 from [33], we postpone
its proof until Appendix A. We now introduce several useful corollaries of it.

Firstly, by choosing $b_i = 0$, Theorem 3.2 immediately implies the following version of
Theorem 3.1.



Corollary 3.3. The conclusion of Theorem 3.1 also holds if $x_i$ are i.i.d. copies of a complex
random variable $\xi$ satisfying $\mathbb{E}|\xi|^2 = 1$ and $\mathbb{E}\xi = \mathbb{E}[\operatorname{Im}(\xi)\operatorname{Re}(\xi)] = 0$.

Secondly, if we $\beta$-approximate the components of $c_i, c_i'$ by rational numbers of the form
$p/q$, $|p|, |q| = O(\beta^{-1})$, then we obtain the following.



Corollary 3.4. Assume as in Theorem 3.2. Then there exist two pairs of complex numbers
$(c_1, c_2)$ and $(c_1', c_2')$ for which $|c_i|, |c_i'|$ are bounded from below and above by $c_0$ and $C_0$, $|c_1/c_2 -
c_1'/c_2'| \ge \alpha$, and the components of $c_i, c_i'$ are rational numbers of the form $p/q$, $|p|, |q| = O(\beta^{-1})$,
such that for any number $n'$ between $n^{\varepsilon}$ and $n$, there exists a proper symmetric GAP $Q =
\{\sum_{i=1}^{r} k_i g_i : k_i \in \mathbb{Z}, |k_i| \le L_i\} \subset \mathbb{C}$ whose parameters satisfy (i) and (ii) of Theorem 3.1
and for at least $n - n'$ indices $i$ the following holds:

• $a_i$ are $O(\beta)$-close to the GAP $P_1 := \frac{c_2}{c_1 c_2' - c_1' c_2} \cdot Q + \frac{c_2'}{c_1 c_2' - c_1' c_2} \cdot Q$;

• $b_i$ are $O(\beta)$-close to the GAP $P_2 := \frac{c_1}{c_1 c_2' - c_1' c_2} \cdot Q + \frac{c_1'}{c_1 c_2' - c_1' c_2} \cdot Q$;

consequently, $a_i$ and $b_i$ are $O(\beta)$-close to the combined GAP

$$P := \frac{c_1}{c_1 c_2' - c_1' c_2} \cdot Q + \frac{c_1'}{c_1 c_2' - c_1' c_2} \cdot Q + \frac{c_2}{c_1 c_2' - c_1' c_2} \cdot Q + \frac{c_2'}{c_1 c_2' - c_1' c_2} \cdot Q.$$

Notice that the rank of $P$ is $O_{B,\varepsilon}(1)$ and the size of $P$ is $n^{O_{B,\varepsilon}(1)}$. Roughly speaking, the
fact that the parameters involved in $P$ are rational numbers will enable us to control the
number of such GAPs easily. We will exploit this pleasant fact in more detail in the sections
that follow.



We remark that the assumption $-1 < \rho < 1$ is necessary because Theorem 3.2 is not valid for
the boundary case $|\rho| = 1$. For instance, if $\xi$ is a symmetric random variable and if $x_i' = -x_i$
(in which case $\rho = -1$), then the assumption $\mathbb{P}_{\mathbf{x},\mathbf{x}'}(|\sum_{i=1}^{n} a_i x_i + b_i x_i' - a| \le \beta) \ge n^{-B}$ is
equivalent to $\mathbb{P}_{\mathbf{x}}(|\sum_{i=1}^{n} (a_i - b_i) x_i - a| \le \beta) \ge n^{-B}$. From here, only information for the
$a_i - b_i$ can be deduced but not for the individual $a_i$ and $b_i$ separately.
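The degenerate case $\rho = -1$ is easy to reproduce by brute force. In the sketch below, which uses hypothetical sample coefficients and an exact point mass in place of a small ball, the form $\sum_i a_i x_i + b_i x_i'$ with $x_i' = -x_i$ depends only on the differences $a_i - b_i$, so coefficients with no additive structure at all can still have concentration probability one.

```python
import itertools
from fractions import Fraction


def concentration(a, b, target=0):
    """P(sum a_i x_i + b_i x'_i = target) with x_i uniform on {-1, +1} and x'_i = -x_i."""
    n = len(a)
    hits = 0
    for x in itertools.product((1, -1), repeat=n):
        x_prime = [-xi for xi in x]
        total = sum(ai * xi + bi * xpi for ai, bi, xi, xpi in zip(a, b, x, x_prime))
        hits += (total == target)
    return Fraction(hits, 2 ** n)


a = [7, 19, 101, 1009, 5003]   # arbitrary values without visible structure
b = list(a)                    # b_i = a_i, so every difference a_i - b_i vanishes

print(concentration(a, b))                               # the form is identically zero
print(concentration(a, [ai - 1 for ai in a], target=1))  # same law as x_1 + ... + x_5 hitting 1
```

Only the vector of differences is constrained here, which matches the remark that nothing can be said about $a_i$ and $b_i$ individually.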



Finally, the conclusion of Theorem 3.2 is somewhat optimal. Indeed, assume that there
exist $(c_1, c_1')$, $(c_2, c_2')$ with $c_1^2 + c_1'^2 = c_2^2 + c_2'^2 = 1$ and $c_1 c_2' \ne 0$ such that $\xi_1 = c_1 \psi_1 + c_1' \psi_2$
and $\xi_2 = c_2 \psi_1 + c_2' \psi_2$, where $\psi_2$ is an independent copy of $\psi_1$. Then the assumption
$\mathbb{P}_{\mathbf{x},\mathbf{x}'}(|\sum_{i=1}^{n} (a_i x_i + b_i x_i') - a| \le \beta) \ge n^{-B}$ becomes $\mathbb{P}_{\psi}(|\sum_{i=1}^{n} ((c_1 a_i + c_2 b_i)\psi_{1i} + (c_1' a_i +
c_2' b_i)\psi_{2i}) - a| \le \beta) \ge n^{-B}$. So, as the $\psi_{ij}$ are independent, structural information for $c_1 a_i + c_2 b_i$
and $c_1' a_i + c_2' b_i$ can be deduced using Theorem 3.1 in the same way we concluded using
Theorem 3.2.



4. Anti-concentration of bilinear forms 
We will next apply Corollary 3.4 to infer an inverse version for the concentration of the
bilinear form $\sum_{1 \le i,j \le n} a_{ij}(x_i + f_i)(x_j' + f_j')$ appearing in Theorem 2.4.

Theorem 4.1. Let $0 < \varepsilon < 1$, $|\rho| < 1$ and $B > 0$ be given. Let $\beta > 0$ be an arbitrary real
number that may depend on $n$. Assume that $\sum_{i,j} |a_{ij}|^2 = 1$ and

$$\sup_a \mathbb{P}_{\mathbf{x},\mathbf{x}'}\Big( \Big|\sum_{1 \le i,j \le n} a_{ij}(x_i + f_i)(x_j' + f_j') - a\Big| \le \beta \Big) = \gamma \ge n^{-B},$$

where

• $(x_i, x_i')$ are i.i.d. copies of $(\xi_1, \xi_2)$ from a given $(\mu, \rho)$-family,

• any collection $x_{i_1}, \ldots, x_{i_k}, x_{j_1}', \ldots, x_{j_l}'$ of random variables are mutually independent
as long as the indices $i_1, \ldots, i_k, j_1, \ldots, j_l$ are distinct.

Then, there exist an integer $k \ne 0$, $|k| = n^{O_{B,\varepsilon}(1)}$, a set of $r = O(1)$ rows $r_{i_1}, \ldots, r_{i_r}$ of the
array $A_n = (a_{ij})_{1 \le i,j \le n}$, and a set $I$ of size at least $n - 2n^{\varepsilon}$ such that for each $i \in I$, there
exist integers $k_{ii_1}, \ldots, k_{ii_r}$, all bounded by $n^{O_{B,\varepsilon}(1)}$, such that the following holds:

$$\mathbb{P}_{\mathbf{z}}\Big( \Big|\Big\langle \mathbf{z},\, k\, r_i(A_n) + \sum_{j=1}^{r} k_{ii_j} r_{i_j}(A_n) \Big\rangle\Big| \le \beta n^{O_{B,\varepsilon}(1)} \Big) \ge n^{-O_{B,\varepsilon}(1)}, \tag{9}$$

where $\mathbf{z} = (z_1, \ldots, z_n)$ and $z_i$ are i.i.d. copies of $\eta^{(1/2)}(\xi_2 - \xi_2')$, where $\xi_2'$ is an i.i.d. copy
of $\xi_2$ and $\eta^{(1/2)}$ is a Bernoulli random variable of parameter 1/2 independent of $\xi_2$ and $\xi_2'$.

We remark that this result is an analogue of [31, Theorem 1.8], in which case we studied
the concentration of quadratic forms of type $\sum_{i,j} a_{ij} x_i x_j$. It seems plausible that after
an appropriate linear transform we can trap most of the entries $a_{ij}$ of $A_n$ into a GAP of
small size and small rank (in the spirit of [30]). However, we do not pursue this matter
here. Roughly speaking, in application to justify Theorem 2.1 we just need the conclusion
of Theorem 4.1 for only one row.

To prove Theorem 4.1, we will follow the machinery from [31] with some extra twists. As the first
step, we free the dependencies between $\mathbf{x}$ and $\mathbf{x}'$.

4.2. Decoupling lemma. Let $U$ be an arbitrary subset of $\{1, \ldots, n\}$ such that both $U$
and $\bar{U}$ are of size $\Omega(n)$. Let $A_U$ be a matrix of size $n$ by $n$ defined as

$$A_U(ij) := \begin{cases} a_{ij} & \text{if } i \in U \text{ and } j \in \bar{U}, \text{ or } i \in \bar{U} \text{ and } j \in U, \\ 0 & \text{otherwise,} \end{cases}$$

where we denote by $A_U(ij)$ the $ij$ entry of $A_U$. We prove the following lemma by a series
of applications of the Cauchy-Schwarz inequality.

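Lemma 4.3. Assume that

$$\gamma = \sup_a \mathbb{P}_{\mathbf{x},\mathbf{x}'}\Big( \Big|\sum_{i,j} a_{ij} x_i x_j' + \sum_i b_i x_i + \sum_i b_i' x_i' - a\Big| \le \beta \Big) \ge n^{-B},$$

where $\mathbf{x}, \mathbf{x}'$ are defined as in Theorem 4.1. Then,

$$\mathbb{P}_{\mathbf{v},\mathbf{w}}\Big( \Big|\sum_{1 \le i,j \le n} A_U(ij) v_i w_j\Big| = O_B(\beta\sqrt{\log n}) \Big) = \Theta(\gamma^4), \tag{10}$$

where $\mathbf{v} = (v_1, \ldots, v_n)$, $\mathbf{w} = (w_1, \ldots, w_n)$, and

• $(v_i, w_i)$ are i.i.d. copies of a vector $(\xi_1 - \xi_1', \xi_2 - \xi_2')$, where $(\xi_1', \xi_2')$ is an independent
copy of $(\xi_1, \xi_2)$,

• any collection $v_{i_1}, \ldots, v_{i_k}, w_{j_1}, \ldots, w_{j_l}$ of random variables are mutually independent
as long as the indices $i_1, \ldots, i_k, j_1, \ldots, j_l$ are distinct.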

An advantage of considering the sum $\sum_{1 \le i,j \le n} A_U(ij) v_i w_j$ over the original form $\sum_{i,j} a_{ij} x_i x_j'$
is that we can rewrite the former as $\sum_{i \in U} (\sum_{j \in \bar{U}} a_{ij} w_j) v_i + \sum_{i \in U} (\sum_{j \in \bar{U}} a_{ji} v_j) w_i$. Thus, if
all $v_j, w_j$, $j \in \bar{U}$, are held fixed, Theorem 3.2 applied to (10) allows us to extract useful
information on $\sum_{j \in \bar{U}} a_{ij} w_j$ and $\sum_{j \in \bar{U}} a_{ji} v_j$. As the proof of Lemma 4.3 is standard, we
postpone it until Appendix B.
We next apply Theorem 3.2 to obtain the following key structure for the entries of $A_U$.

Lemma 4.4. There exist a set $I_0(U)$ of size $O_{B,\varepsilon}(1)$ and a set $I(U)$ of size at least $n - n^{\varepsilon}$,
and a nonzero integer $k(U)$ bounded by $n^{O_{B,\varepsilon}(1)}$ such that for any $i \in I(U)$, there are integers
$k_{ii_0}(U)$, $i_0 \in I_0(U)$, all bounded by $n^{O_{B,\varepsilon}(1)}$, such that

$$\mathbb{P}_{\mathbf{y}}\Big( \Big|\Big\langle k(U) r_i(A_U) + \sum_{i_0 \in I_0(U)} k_{ii_0}(U) r_{i_0}(A_U),\, \mathbf{y} \Big\rangle\Big| \le \beta n^{O_{B,\varepsilon}(1)} \Big) = n^{-O_{B,\varepsilon}(1)}, \tag{11}$$

where $\mathbf{y} = (y_1, \ldots, y_n)$ and $y_i$ are i.i.d. copies of $\xi_2 - \xi_2'$.



Proof (of Theorem 4.1 assuming Lemma 4.4). See [31, Section 4]. □

For the rest of this section we prove Lemma 4.4 using Lemma 4.3. First of all, as $A_U =
A_{\bar{U}}$, it is enough to verify (11) for any index $i$ from $U$. Also, it suffices to assume $\xi$ to
have discrete distribution. The continuous case can be recovered by approximating the
continuous distribution by a discrete one while holding $n$ fixed.

We begin by applying Corollary 3.4.



Lemma 4.5. Assume as in the conclusion of Lemma Then, the following holds with 
probability at least 3^/4 with respect to and wg. There exist a proper symmetric GAP 
P W{J C C of rank Os j£ (l) and size n° B ' t<yl \ and an index set C U of size \U\ — n € such 
that (ri(Au),Wjj) is f3-close to P v , v for all i G Iw - 



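Proof (of Lemma 4.5). Write

$$\sum_{i\in U, j\in \bar U} a_{ij} v_i w_j + \sum_{i\in \bar U, j\in U} a_{ij} v_i w_j = \sum_{i\in U}\Big(\sum_{j\in \bar U} a_{ij} w_j\Big) v_i + \sum_{i\in U}\Big(\sum_{j\in \bar U} a_{ji} v_j\Big) w_i$$
$$= \sum_{i\in U} \langle r_i(A_U), w_{\bar U}\rangle v_i + \sum_{i\in U} \langle r_i(A_U^T), v_{\bar U}\rangle w_i.$$

We say that a pair of vectors $(v_{\bar U}, w_{\bar U})$ is good if

$$\mathbf{P}_{v_U, w_U}\Big(\Big|\sum_{i\in U} \langle r_i(A_U), w_{\bar U}\rangle v_i + \sum_{i\in U} \langle r_i(A_U^T), v_{\bar U}\rangle w_i - a\Big| \le \beta\Big) \ge \gamma/4.$$

We call $(v_{\bar U}, w_{\bar U})$ bad otherwise.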



Let $G$ denote the collection of good pairs. We are going to estimate the probability $p$ of a
randomly chosen pair $(v_{\bar U}, w_{\bar U})$ being bad by an averaging method.






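$$\mathbf{P}_{v_{\bar U}, w_{\bar U}} \mathbf{P}_{v_U, w_U}\Big(\Big|\sum_{i\in U} \langle r_i(A_U), w_{\bar U}\rangle v_i + \sum_{i\in U} \langle r_i(A_U^T), v_{\bar U}\rangle w_i - a\Big| \le \beta\Big) = \gamma,$$

$$\gamma p/4 + 1 - p \ge \gamma,$$
$$(1-\gamma)/(1-\gamma/4) \ge p.$$

Thus, the probability of a randomly chosen $(v_{\bar U}, w_{\bar U})$ belonging to $G$ is at least

$$1 - p \ge (3\gamma/4)/(1-\gamma/4) \ge 3\gamma/4.$$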
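The averaging step can be made explicit with a few lines of arithmetic. In the sketch below (an illustration, not part of the original argument), `gamma` is the overall probability, a good pair contributes conditional probability at most 1, and a bad pair contributes less than `gamma / 4`; the helper name is hypothetical.

```python
def min_good_probability(gamma):
    """Lower bound on P(good) implied by gamma <= (1 - p) * 1 + p * (gamma / 4)."""
    # Solving the inequality for p = P(bad) gives p <= (1 - gamma) / (1 - gamma / 4).
    p_max = (1 - gamma) / (1 - gamma / 4)
    return 1 - p_max
```

Algebraically the returned value equals $(3\gamma/4)/(1-\gamma/4)$, which is at least $3\gamma/4$ for every $0 < \gamma \le 1$.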
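Consider a good vector $(v_{\bar U}, w_{\bar U}) \in G$. By definition, we have

$$\mathbf{P}_{v_U, w_U}\Big(\Big|\sum_{i\in U} \langle r_i(A_U), w_{\bar U}\rangle v_i + \sum_{i\in U} \langle r_i(A_U^T), v_{\bar U}\rangle w_i - a\Big| \le \beta\Big) \ge \gamma/4.$$

Next, if $\langle r_i(A_U), w_{\bar U}\rangle = 0$ for all $i$, then the conclusion of the lemma holds trivially for
$P_{w_{\bar U}} := 0$. Otherwise, we apply the last conclusion of Corollary 3.4 to the sequence
$\{\langle r_i(A_U), w_{\bar U}\rangle, \langle r_i(A_U^T), v_{\bar U}\rangle, i \in U\}$ (after a rescaling). As a consequence, we obtain an
index set $I_{w_{\bar U}} \subset U$ of size $|U| - n^{\epsilon}$ and a proper symmetric GAP $P_{w_{\bar U}} \subset \mathbf{C}$ of rank $O_{B,\epsilon}(1)$
and size $n^{O_{B,\epsilon}(1)}$, together with its elements $q_i(w_{\bar U})$, such that $|\langle r_i(A_U), w_{\bar U}\rangle - q_i(w_{\bar U})| \le \beta$
for all $i \in I_{w_{\bar U}}$. □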



4.6. Property of the $q_i(w_{\bar U})$'s. We now work with the GAP elements $q_i(w_{\bar U})$, where
$w_{\bar U} \in G$. Because these points occupy most of an integer box, we can infer a great
deal of structural relations among them. To do this, we first pause to introduce a pleasant
property of generalized arithmetic progressions.

Assume that $P = \{k_1 g_1 + \dots + k_r g_r : -K_i \le k_i \le K_i\}$ is a proper symmetric GAP which
contains a set $U = \{u_1,\dots,u_n\}$. We consider $P$ together with the map $\Phi : P \to \mathbf{R}^r$ which
maps $k_1 g_1 + \dots + k_r g_r$ to $(k_1,\dots,k_r)$. Because $P$ is proper, this map is well defined and is a bijection onto its image. We know
that $P$ contains $U$, but we do not yet know that $U$ is non-degenerate in $P$ in the sense that
the set $\Phi(U)$ has full rank in $\mathbf{R}^r$. In the latter case, we say $U$ spans $P$. The following lemma
states that we can always assume this without loss of any additive structure.

Lemma 4.7. Assume that $U$ is a subset of a proper symmetric GAP $P$ of rank $r$. Then there
exists a proper symmetric GAP $Q$ that contains $U$ such that the following hold.

• $\mathrm{rank}(Q) \le r$ and $|Q| \le O_r(1)|P|$.

• $U$ spans $Q$, that is, $\Phi(U)$ has full rank in $\mathbf{R}^{\mathrm{rank}(Q)}$.

We refer the reader to [31, Theorem 2.1] for a short proof of this lemma.
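Whether a set spans a GAP is a finite linear-algebra question about its coefficient vectors. The sketch below is an illustration only: it takes the images $\Phi(u)$ as integer tuples and tests for full rank by exact elimination. The function names are hypothetical, and the sketch does not construct the smaller GAP $Q$ of Lemma 4.7.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of integer vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

def spans(coeff_vectors, r):
    """True if the coefficient vectors Phi(U) have full rank r."""
    return rank(coeff_vectors) == r
```

For instance, coefficient vectors `(1, 0), (2, 0), (3, 0)` inside a rank-2 GAP do not span it: they use only the first generator, so a rank-1 GAP already contains them.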
Common generating indices. By Lemma 4.7 we may assume that the $q_i(w_{\bar U})$ span $P_{w_{\bar U}}$.
We choose $s$ indices $i_{w_1},\dots,i_{w_s}$ from $I_{w_{\bar U}}$ such that the $q_{i_{w_j}}(w_{\bar U})$ span $P_{w_{\bar U}}$, where $s$ is the rank
of $P_{w_{\bar U}}$. Note that $s = O_{B,\epsilon}(1)$ for all $w_{\bar U} \in G$.

Consider the tuples $(i_{w_1},\dots,i_{w_s})$ for all $w_{\bar U} \in G$. Because there are $\sum_s O_{B,\epsilon}(n^s) = n^{O_{B,\epsilon}(1)}$
possibilities these tuples can take, there exists a tuple, say $(1,\dots,r)$ (by rearranging the
rows of $A_U$ if needed), such that $(i_{w_1},\dots,i_{w_s}) = (1,\dots,r)$ for all $w_{\bar U} \in G'$, a subset $G'$ of
$G$ which satisfies

$$\mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G') \ge \mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G)/n^{O_{B,\epsilon}(1)} = \gamma/n^{O_{B,\epsilon}(1)}. \tag{12}$$

Common coefficient tuple. For each $1 \le i \le r$, we express $q_i(w_{\bar U})$ in terms of the
generators of $P_{w_{\bar U}}$ for each $w_{\bar U} \in G'$,

$$q_i(w_{\bar U}) = c_{i1}(w_{\bar U}) g_1(w_{\bar U}) + \dots + c_{ir}(w_{\bar U}) g_r(w_{\bar U}),$$

where $c_{i1}(w_{\bar U}),\dots,c_{ir}(w_{\bar U})$ are integers bounded by $n^{O_{B,\epsilon}(1)}$, and the $g_i(w_{\bar U})$ are the generators
of $P_{w_{\bar U}}$.

We will show that there are many $w_{\bar U}$ that correspond to the same coefficients $c_{ij}$.

Consider the collection of the coefficient-tuples $\big((c_{11}(w_{\bar U}),\dots,c_{1r}(w_{\bar U}));\dots;(c_{r1}(w_{\bar U}),\dots,c_{rr}(w_{\bar U}))\big)$
for all $w_{\bar U} \in G'$. Because the number of possibilities these tuples can take is at most

$$(n^{O_{B,\epsilon}(1)})^{r^2} = n^{O_{B,\epsilon}(1)},$$

there exists a coefficient-tuple, say $\big((c_{11},\dots,c_{1r}),\dots,(c_{r1},\dots,c_{rr})\big)$, such that

$$\big((c_{11}(w_{\bar U}),\dots,c_{1r}(w_{\bar U}));\dots;(c_{r1}(w_{\bar U}),\dots,c_{rr}(w_{\bar U}))\big) = \big((c_{11},\dots,c_{1r}),\dots,(c_{r1},\dots,c_{rr})\big)$$

for all $w_{\bar U} \in G''$, a subset of $G'$ which satisfies

$$\mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G'') \ge \mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G')/n^{O_{B,\epsilon}(1)} \ge \gamma/n^{O_{B,\epsilon}(1)}. \tag{13}$$

In summary, there exist $r$ tuples $(c_{11},\dots,c_{1r}),\dots,(c_{r1},\dots,c_{rr})$, whose components are
integers bounded by $n^{O_{B,\epsilon}(1)}$, such that the following hold for all $w_{\bar U} \in G''$.

• $q_i(w_{\bar U}) = c_{i1} g_1(w_{\bar U}) + \dots + c_{ir} g_r(w_{\bar U})$, for $i = 1,\dots,r$.

• The vectors $(c_{11},\dots,c_{1r}),\dots,(c_{r1},\dots,c_{rr})$ span $\mathbf{Z}^{\mathrm{rank}(P_{w_{\bar U}})}$.

Next, because $|I_{w_{\bar U}}| \ge |U| - n^{\epsilon}$ for each $w_{\bar U} \in G''$, by an averaging argument, there exists
a set $I \subset U$ of size $|U| - 2n^{\epsilon}$ such that for each $i \in I$ we have

$$\mathbf{P}_{w_{\bar U}}(i \in I_{w_{\bar U}},\ w_{\bar U} \in G'') \ge \mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G'')/2. \tag{14}$$

From now on we fix an arbitrary row $r$ with index from $I$. We will focus on those $w_{\bar U} \in G''$
for which the index of $r$ belongs to $I_{w_{\bar U}}$.

Common coefficient tuple for each individual. Because $q(w_{\bar U}) \in P_{w_{\bar U}}$ ($q(w_{\bar U})$ is the
element of $P_{w_{\bar U}}$ that is $\beta$-close to $\langle r, w_{\bar U}\rangle$), we can write

$$q(w_{\bar U}) = c_1(w_{\bar U}) g_1(w_{\bar U}) + \dots + c_r(w_{\bar U}) g_r(w_{\bar U}),$$

where the $c_i(w_{\bar U})$ are integers bounded by $n^{O_{B,\epsilon}(1)}$.

For short, for each $i$ we denote by $v_i$ the vector $(c_{i1},\dots,c_{ir})$; we will also denote by $v_{r,w_{\bar U}}$
the vector $(c_1(w_{\bar U}),\dots,c_r(w_{\bar U}))$.

Because $P_{w_{\bar U}}$ is spanned by $q_1(w_{\bar U}),\dots,q_r(w_{\bar U})$, we have $k := \det(v_1,\dots,v_r) \ne 0$, and

$$k\,q(w_{\bar U}) + \det(v_{r,w_{\bar U}}, v_2,\dots,v_r)\, q_1(w_{\bar U}) + \dots + \det(v_{r,w_{\bar U}}, v_1,\dots,v_{r-1})\, q_r(w_{\bar U}) = 0. \tag{15}$$

It is crucial to note that $k$ is independent of the choice of $r$ and $w_{\bar U}$.
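The linear relation behind (15) is Cramer's rule applied to the coefficient vectors. The sketch below is illustrative, with hypothetical function names: given linearly independent integer vectors `vs` and a further integer vector `c`, it returns `k = det(vs)` and determinants `D_i` with `k * c = sum_i D_i * vs[i]`. The signs and the ordering of columns inside the determinants are fixed here by the standard form of Cramer's rule and may differ from the convention used in (15).

```python
def det(rows):
    """Determinant of a square integer matrix by Laplace expansion (small sizes only)."""
    n = len(rows)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total

def cramer_coefficients(vs, c):
    """Return (k, [D_1, ..., D_r]) such that k * c == sum_i D_i * vs[i]."""
    k = det(vs)
    # D_i is the determinant with the i-th vector replaced by c.
    ds = [det(vs[:i] + [c] + vs[i + 1:]) for i in range(len(vs))]
    return k, ds
```

Since `k` depends only on `vs`, the same nonzero integer multiplies every new vector `c`, and all coefficients are integers whose size is controlled by the entries.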

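Next, because each coefficient of (15) is bounded by $n^{O_{B,\epsilon}(1)}$, there exists a subset $G_r''$ of $G''$
such that all $w_{\bar U} \in G_r''$ correspond to the same identity, and

$$\mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G_r'') \ge \big(\mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G'')/2\big)/(n^{O_{B,\epsilon}(1)})^r = \gamma/n^{O_{B,\epsilon}(1)} = n^{-O_{B,\epsilon}(1)}. \tag{16}$$

In other words, there exist integers $k_1,\dots,k_r$ depending on $r$, all bounded by $n^{O_{B,\epsilon}(1)}$, such
that

$$k\,q(w_{\bar U}) + k_1 q_1(w_{\bar U}) + \dots + k_r q_r(w_{\bar U}) = 0 \tag{17}$$

for all $w_{\bar U} \in G_r''$.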

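4.8. Passing back to $A_U$. Because the $q_i(w_{\bar U})$ are $\beta$-close to $\langle r_i, w_{\bar U}\rangle$, it follows from (17)
that

$$|\langle k r, w_{\bar U}\rangle + \langle k_1 r_1, w_{\bar U}\rangle + \dots + \langle k_r r_r, w_{\bar U}\rangle| = |\langle k r + k_1 r_1 + \dots + k_r r_r, w_{\bar U}\rangle| \le n^{O_{B,\epsilon}(1)}\beta.$$

Furthermore, as $\mathbf{P}_{w_{\bar U}}(w_{\bar U} \in G_r'') = n^{-O_{B,\epsilon}(1)}$, we have

$$\mathbf{P}_{w_{\bar U}}\big(|\langle k r + k_1 r_1 + \dots + k_r r_r, w_{\bar U}\rangle| \le n^{O_{B,\epsilon}(1)}\beta\big) = n^{-O_{B,\epsilon}(1)}. \tag{18}$$

As (18) holds for any row $r$ with index from $I$, this completes the proof of Lemma 4.4.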



5. Random matrix: the inverse step 



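We now give a proof of Theorem 2.4. We first apply Theorem 4.1 to the $a_{ij}$ to obtain

$$\mathbf{P}_z\Big(\Big|\Big\langle z,\ k\, r_i(A_{n-1}) + \sum_j k_{ij}\, r_{i_j}(A_{n-1})\Big\rangle\Big| \le n^{-A+O_{B,\epsilon}(1)}\Big) \ge n^{-O_{B,\epsilon}(1)}.$$

For short, we denote by $r_i'$ the vector $k\, r_i(A_{n-1}) + \sum_j k_{ij}\, r_{i_j}(A_{n-1})$. Thus, for any $i \in I$,

$$\mathbf{P}_z\big(|\langle z, r_i'\rangle| \le n^{-A+O_{B,\epsilon}(1)}\big) \ge n^{-O_{B,\epsilon}(1)}. \tag{19}$$

Set

$$K := n^{-A/2}.$$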



We consider two cases. 



Case 1 (non-degenerate case). There exists $i_0 \in I$ such that $\|r_{i_0}'\|_2 \ge K$. Because $r_{i_0}' =
k\, r_{i_0}(A_{n-1}) + \sum_{j\in I_0} k_{i_0 j}\, r_j(A_{n-1})$, the vector $r_{i_0}'$ is orthogonal to $n - |I_0| - 1 = n - O_{B,\epsilon}(1)$ column
vectors of $M_{n-1}$.

Set

$$v := r_{i_0}'/\|r_{i_0}'\|_2.$$

Hence, $\langle v, c_i(M_{n-1})\rangle = 0$ for at least $n - O_{B,\epsilon}(1)$ column vectors of $M_{n-1}$.

Also, it follows from (19) that

$$\mathbf{P}_z\big(|\langle z, v\rangle| \le n^{-A/2+O_{B,\epsilon}(1)}\big) \ge n^{-O_{B,\epsilon}(1)}. \tag{20}$$

Next, Corollary 3.3 applied to (20) implies that $v$ can be approximated by a vector $u$ as
follows.

• $|u_i - v_i| \le n^{-A/2+O_{B,\epsilon}(1)}$ for all $i$.

• There exists a GAP of rank $O_{B,\epsilon}(1)$ and size $n^{O_{B,\epsilon}(1)}$ that contains at least $n - n^{\epsilon}$
components $u_i$.

• All the components $u_i$, and all the generators of the GAP, are rational complex
numbers of the form $\frac{p}{q} + \sqrt{-1}\frac{p'}{q'}$, where $|p|, |q|, |p'|, |q'| \le n^{A/2+O_{B,\epsilon}(1)}$.

Note that, by the approximation above, we have $\|u\|_2 \asymp 1$ and $|\langle u, c_i(M_{n-1})\rangle| \le n^{-A/2+O_{B,\epsilon}(1)}$
for at least $n - O_{B,\epsilon}(1)$ column vectors of $M_{n-1}$.

Case 2 (degenerate case). $\|r_i'\|_2 \le K$ for all $i \in I$. Hence, with $I_0 := \{i_1,\dots,i_r\}$,

$$\Big\|k\, r_i(A_{n-1}) + \sum_{j\in I_0} k_{ij}\, r_j(A_{n-1})\Big\|_2 = \|r_i'\|_2 \le K. \tag{21}$$

Next, because $\sum_j \|c_j(A_{n-1})\|_2^2 = 1$, there exists an index $j_0$ such that $\|c_{j_0}(A_{n-1})\|_2 \ge n^{-1/2}$.
Consider this column vector.

It follows from (21) that for any $i \in I$,

$$\Big|k\, c_{j_0}(i) + \sum_{j\in I_0} k_{ij}\, c_{j_0}(j)\Big| \le K.$$

The above inequality means that the components $c_{j_0}(i)$ of $c_{j_0}(A_{n-1})$ belong to a GAP generated
by $c_{j_0}(j)/k$, $j \in I_0$, up to an error $K$. This suggests the following approximation.

For each $j \notin I$, we approximate $c_{j_0}(j)$ by a number $v_j$ of the form $(1/\lceil 2K^{-1}\rceil)\cdot \mathbf{Z}$ such that
$|v_j - c_{j_0}(j)| \le K$. We next set

$$v_i := -\sum_{j\in I_0} k_{ij} v_j / k$$

for any $i \in I$.

Thus, $v_i$ belongs to a GAP of rank $O_{B,\epsilon}(1)$ and size $n^{O_{B,\epsilon}(1)}$ for all $i \in I$.
With $v = (v_1,\dots,v_{n-1})$, we have

$$\|v - c_{j_0}(A_{n-1})\|_2 \le K n^{O_{B,\epsilon}(1)}.$$

Furthermore, by Condition 1 and because $\langle c_{j_0}(A_{n-1}), r_i(M_{n-1})\rangle = 0$ for $i \ne j_0$, we infer
that

$$|\langle v, r_i(M_{n-1})\rangle| \le K n^{O_{B,\epsilon}(1)}.$$
Note that $\|v\|_2 \gg n^{-1/2}$. Set $u := \lfloor 1/\|v\|_2 \rfloor \cdot v$; we then obtain

• $|\langle u, r_i(M_{n-1})\rangle| \le n^{-A/2+O_{B,\epsilon}(1)}$ for $n - 2$ rows of $M_{n-1}$.

• There exists a GAP of rank $O_{B,\epsilon}(1)$ and size $n^{O_{B,\epsilon}(1)}$ that contains at least $n - 2n^{\epsilon}$
components $u_i$.

• All the components $u_i$, and all the generators of the GAP, are rational complex
numbers of the form $\frac{p}{q} + \sqrt{-1}\frac{p'}{q'}$, where $|p|, |q|, |p'|, |q'| \le n^{A/2+O_{B,\epsilon}(1)}$.

6. Random matrix: the counting step 



We now give a proof of Theorem 2.5. Our argument, which follows the "divide and conquer"
strategy, is simple and purely combinatorial. We note that a similar but simpler treatment
for symmetric matrices has appeared in [32, Section 5].

For convenience, let us replace $M_{n-1}$ by $M_n$. We will consider only the case $|\langle u, r_i(M_n)\rangle| \le
n^{-A/2+O_{B,\epsilon}(1)}$ for $n - O_{B,\epsilon}(1)$ rows of $M_n$; the remaining case $|\langle u, c_i(M_n)\rangle| \le n^{-A/2+O_{B,\epsilon}(1)}$
can be treated identically.

Let $\mathcal{N}$ be the number of such structural vectors $u$. Because each GAP is determined by its
generators and dimensions, the number of $Q$'s is bounded by

$$\#\{Q : \text{there exists } u \text{ such that } u \in Q\} = (n^{2A+O_{B,\epsilon}(1)})^{O_{B,\epsilon}(1)} (n^{O_{B,\epsilon}(1)})^{O_{B,\epsilon}(1)} = n^{O_{A,B,\epsilon}(1)}.$$

Next, for a given $Q$ of rank $O_{B,\epsilon}(1)$ and size $n^{O_{B,\epsilon}(1)}$, there are at most $n^{n-2n^{\epsilon}} |Q|^{n-2n^{\epsilon}} =
n^{O_{B,\epsilon}(n)}$ ways to choose the $n - 2n^{\epsilon}$ components $u_i$ that $Q$ contains. Because the remaining
components belong to the set $\{\frac{p}{q} + i\frac{p'}{q'} : |p|, |q|, |p'|, |q'| \le n^{A/2+O_{B,\epsilon}(1)}\}$, there are at most
$(n^{2A+O_{B,\epsilon}(1)})^{2n^{\epsilon}} = n^{O_{A,B,\epsilon}(n^{\epsilon})}$ ways to choose them.

Hence, we obtain the key bound

$$\mathcal{N} \le n^{O_{A,B,\epsilon}(1)}\, n^{O_{B,\epsilon}(n)}\, n^{O_{A,B,\epsilon}(n^{\epsilon})} = n^{O_{B,\epsilon}(n)}. \tag{22}$$



Set $\beta_0 := n^{-A/2+O_{B,\epsilon}(1)}$, the bound obtained from the conclusion of Theorem 2.4. For a
given vector $u$, we define $P_{\beta_0}(u)$ as follows:

$$P_{\beta_0}(u) := \mathbf{P}\big(|\langle u, r_i(M_n)\rangle| \le \beta_0 \text{ for } n - O_{B,\epsilon}(1) \text{ rows of } M_n\big).$$

For the sake of discussion, let us pretend for now that the rows of $X_n$ are independent. By
definition, the vector $u$ is orthogonal to almost every row of $M_n$. Thus, if $u$ is fixed, the
probability of this event is bounded by

$$P_{\beta_0}(u) \le \big(\mathbf{P}_x(|u_1 x_1 + \dots + u_n x_n| \le \beta_0)\big)^{n-O(1)} =: \gamma^{n-O(1)},$$

where $x_1,\dots,x_n$ are i.i.d. copies of $\xi$.

Now, if $\gamma$ is small, say $n^{-\Omega(1)}$, then $P_{\beta_0}(u)$ is $n^{-\Omega(n)}$. Thus the contribution of these $P_{\beta_0}(u)$
to the total sum $\sum_u P_{\beta_0}(u)$ is negligible, taking into account the bound on $\mathcal{N}$.



Next, if $\gamma$ is comparably large, $\gamma = n^{-O(1)}$, then by Theorem 3.1, most of the components
$u_i$ are close to a new GAP of rank $O(1)$ and of size $O(\gamma^{-1}/\sqrt{n})$. This would then enable us
to approximate $u$ by a new vector $u'$ in such a way that $|\langle u', r_i(M_n)\rangle|$ is still of order $O(\beta_0)$
and the components of $u'$ are now from the new GAPs. The number $\mathcal{N}'$ of these $u'$ can be
bounded by $(\gamma^{-1}/n^{\epsilon})^n$, while we recall that $P_{\beta_0}(u')$ is of order $\gamma^n$. Thus, summing over
$u'$ we obtain the desired bound

$$\sum_{u'} P_{\beta_0}(u') \le \#\{\text{new GAPs}\}\,(\gamma^{-1}/n^{\epsilon})^n \gamma^n = O(n^{-\epsilon n+O(1)}).$$

For our model $M_n = F_n + X_n$, we will mainly follow the heuristic
above. Our strategy is to classify $u$ into two classes:

(i) $\mathcal{B}'$ contains those $u$ with very small $P_{\beta_0}(u)$, and thus $\sum_{u\in \mathcal{B}'} P_{\beta_0}(u)$ is negligible;

(ii) the other class $\mathcal{B}$ contains those $u$ with relatively large $P_{\beta_0}(u)$. To deal with those $u$ of the
second type, we will not control $\sum_{u\in \mathcal{B}} P_{\beta_0}(u)$ directly but pass to a class of new
vectors $u'$ that are also almost orthogonal to many rows of $M_n$, while the probability
$\sum_{u'} P_{\beta_0}(u')$ is of order $O(n^{-\epsilon n})$.

What makes our analysis harder is that the estimate $P_{\beta_0}(u) \le \big(\mathbf{P}_x(|u_1 x_1 + \dots + u_n x_n| \le
\beta_0)\big)^{n-O(1)}$ is no longer valid for our random matrix model.



6.1. Technical reductions and upper bounds for $P_{\beta_0}(u)$. By paying a factor of $n^{O_{B,\epsilon}(1)}$
in probability, we may assume that $|\langle u, r_i(M_n)\rangle| \le \beta_0$ for the first $n - O_{B,\epsilon}(1)$ rows of $M_n$.
Also, by paying another factor of $n^{n^{\epsilon}}$ in probability, we may assume that the first $n_0$
components $u_i$ of $u$ belong to a GAP $Q$, and $|u_{n_0}| \ge 1/(2\sqrt{n-1})$ (recall that $\|u\|_2 \asymp 1$), where

$$n_0 := n - 2n^{\epsilon}.$$

We refer to the remaining $u_i$'s as exceptional components. Note that these extra factors do
not affect our final bound $\exp(-\Omega(n))$.

For given $\beta > 0$ and $i \le n_0$, we define

$$\gamma_\beta^{(i)}(u) := \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|x_i u_i + \dots + x_{n_0} u_{n_0} - a| \le \beta),$$

where $x_i,\dots,x_{n_0}$ are i.i.d. copies of $\xi$.

A crucial observation is that, by exposing the rows of $M_{n-1}$ one by one, and due to symmetry
(i.e. $x_{ij}$ is independent from all other entries except $x_{ji}$), the probability $P_\beta(u)$ that
$|\langle u, r_i(M_{n-1})\rangle| \le \beta$ for all $i \le n - O_{B,\epsilon}(1)$ can be bounded by

$$P_\beta(u) \le \prod_{1\le i\le n-O_{B,\epsilon}(1)} \sup_a \mathbf{P}_{x_i,\dots,x_{n-1}}(|x_i u_i + \dots + x_{n-1} u_{n-1} - a| \le \beta)$$
$$\le \prod_{1\le i\le n_0} \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|x_i u_i + \dots + x_{n_0} u_{n_0} - a| \le \beta)$$
$$= \prod_{1\le i\le n_0} \gamma_\beta^{(i)}(u). \tag{23}$$

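Also, because $|u_{n_0}| \ge 1/(2\sqrt{n-1})$, there exist positive constants $c_1, c_2$ such that $c_2 < 1$ and
for any $\beta \le c_1/(2\sqrt{n-1})$ we have

$$\gamma_\beta^{(i)}(u) \le \sup_a \mathbf{P}_{x_{n_0}}(|x_{n_0} u_{n_0} - a| \le \beta) \le 1 - c_2. \tag{24}$$

Thus,

$$P_\beta(u) \le (1 - c_2)^{n_0} = (1 - c_2)^{(1-o(1))n}.$$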
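The concentration quantities used in this subsection can be computed exactly in small cases. The sketch below is an illustration under simplifying assumptions that are not in the paper: the coefficients are real, and the $x_j$ are independent uniform random signs rather than copies of a general $\xi$. The name `small_ball` is hypothetical.

```python
from itertools import product

def small_ball(u, beta):
    """sup_a P(|x_1 u_1 + ... + x_m u_m - a| <= beta) for independent uniform signs x_j."""
    sums = sorted(sum(s * c for s, c in zip(signs, u))
                  for signs in product((-1, 1), repeat=len(u)))
    best = 0
    j = 0
    for i, left in enumerate(sums):
        # Count outcomes in the window [left, left + 2*beta];
        # some optimal window starts at an attained value.
        while j < len(sums) and sums[j] <= left + 2 * beta:
            j += 1
        best = max(best, j - i)
    return best / len(sums)
```

Dropping leading terms can only increase the value, since one may condition on the dropped variables; this is the monotonicity that lets every factor be compared with the single-term bound in (24).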



6.2. Classification. Next, let $C$ be a sufficiently large constant depending on $B$ and $\epsilon$ but
not $A$. We classify $u$ into two classes $\mathcal{B}$ and $\mathcal{B}'$, depending on whether $P_{\beta_0}(u) \ge n^{-Cn}$ or
not.

Because of (22), and because $C$ is large enough,

$$\sum_{u\in \mathcal{B}'} P_{\beta_0}(u) \le n^{O_{B,\epsilon}(n)}/n^{Cn} \le n^{-n/2}. \tag{25}$$

For the rest of this section, we focus on $u \in \mathcal{B}$.

6.3. Approximation for "compressible" vectors. Let $\mathcal{B}_1$ be the collection of $u \in \mathcal{B}$
satisfying the following property: for any $n'$ components $u_{i_1},\dots,u_{i_{n'}}$ among the $u_1,\dots,u_{n_0}$,
we have

$$\sup_a \mathbf{P}_{x_{i_1},\dots,x_{i_{n'}}}(|u_{i_1} x_{i_1} + \dots + u_{i_{n'}} x_{i_{n'}} - a| \le n^{-B-4}) \ge (n')^{-1/2+o(1)}. \tag{26}$$

Here we set

$$n' := n^{1-\epsilon}.$$

For concision we set $\beta = n^{-B-4}$. It follows from Theorem 3.1 that, among any $u_{i_1},\dots,u_{i_{n'}}$,
there are, say, at least $n'/2 + 1$ components that belong to a ball of radius $\beta$ (because our
GAP now has only one element). A simple covering argument then implies that there is a
ball of radius $2\beta$ that contains all but $n' - 1$ components $u_i$.

Thus there exists a vector $u'$ with components in $(2\beta)\cdot(\mathbf{Z} + \sqrt{-1}\,\mathbf{Z})$ satisfying the following conditions.

• $|u_i - u_i'| \le 4\beta$ for all $i$.

• $u_i'$ takes the same value $u$ for at least $n_0 - n'$ indices $i$.

Because of the approximation and of Condition 1, whenever $|\langle u, r_i(M_{n-1})\rangle| \le \beta_0$, we have

$$|\langle u', r_i(M_{n-1})\rangle| \le n(n^{B+1} + n^{\alpha})(4\beta) + \beta_0 =: \beta'.$$

It is clear from the bound on $\beta$ and $\beta_0$ that $\beta' \le c_1/(2\sqrt{n-1})$, and thus by (24),

$$P_{\beta'}(u') \le (1 - c_2)^{(1-o(1))n}.$$
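The rounding of a component to a lattice point can be sketched directly. This is an illustration only, with a hypothetical helper name; it covers the rounding of a single complex component to $(2\beta)\cdot(\mathbf{Z} + \sqrt{-1}\,\mathbf{Z})$, not the choice of the common value taken by most components.

```python
def round_to_lattice(z, step):
    """Nearest point to the complex number z in step * (Z + iZ)."""
    return complex(round(z.real / step) * step, round(z.imag / step) * step)
```

Each coordinate moves by at most `step / 2`, so with `step = 2 * beta` the displacement is at most $\sqrt{2}\,\beta$, comfortably within the $4\beta$ allowed above.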



Now we bound the number of $u'$ obtained from the approximation. First, there are
$O(n^{n-n_0+n'}) = O(n^{2n^{1-\epsilon}})$ ways to choose those $u_i'$ that take the same value $u$, and there are
just $O(\beta^{-1})$ ways to choose $u$. The remaining components belong to the set $(2\beta)\cdot(\mathbf{Z}+i\mathbf{Z})$,
and thus there are at most $O((\beta^{-1})^{n-n_0+n'}) = O(n^{O_{A,B}(n^{1-\epsilon})})$ ways to choose them.

Hence we obtain the total bound

$$\sum_{u\in \mathcal{B}_1} P_{\beta_0}(u) \le \sum_{u'} P_{\beta'}(u') \le O(n^{2n^{1-\epsilon}})\, O(n^{O_{A,B}(n^{1-\epsilon})})\,(1-c_2)^{(1-o(1))n}$$
$$\le (1-c_2)^{(1-o(1))n}.$$

6.4. Approximation for "incompressible" vectors. Assume that $u \in \mathcal{B}_2 := \mathcal{B}\setminus\mathcal{B}_1$. By
exposing the rows of $M_{n-1}$ accordingly, and by paying an extra factor of $O(n^{n'})$ in
probability, we may assume that the components $u_{n_0-n'+1},\dots,u_{n_0}$ satisfy the property

$$\sup_a \mathbf{P}_{x_{n_0-n'+1},\dots,x_{n_0}}(|u_{n_0-n'+1} x_{n_0-n'+1} + \dots + u_{n_0} x_{n_0} - a| \le n^{-B-4}) \le (n')^{-1/2+o(1)}$$
$$\le n^{-1/2+\epsilon/2+o(1)}. \tag{27}$$

Preparation. Next, define a radius sequence $\beta_k$, $k \ge 0$, where $\beta_0 = n^{-A/2+O_{B,\epsilon}(1)}$ is the
bound obtained from the conclusion of Theorem 2.4 and

$$\beta_{k+1} := (n^{B+2} + n^{\alpha+1} + 1)^2 \beta_k.$$

Recall from (23) that

$$P_{\beta_k}(u) \le \prod_{1\le i\le n_0-n'} \gamma_{\beta_k}^{(i)}(u) =: \pi_{\beta_k}(u).$$

Roughly speaking, the reason we truncated the product here is that whenever $i \le n_0 - n'$
and $\beta_k$ is small enough, the terms $\gamma_{\beta_k}^{(i)}(u)$ are smaller than $(n')^{-1/2+o(1)}$, owing to (27). This
fact will allow us to gain some significant factors when applying Theorem 3.1.



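Observe that if $|\langle u, r_i(M_n)\rangle| \le \beta_k$ and if $u'$ is an approximation of $u$ such that $|u_i - u_i'| \le \beta_k$
for all $i$, then

$$\pi_{\beta_k}(u) = \prod_{1\le i\le n_0-n'} \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|u_i x_i + \dots + u_{n_0} x_{n_0} - a| \le \beta_k)$$
$$\le \prod_{1\le i\le n_0-n'} \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|u_i' x_i + \dots + u_{n_0}' x_{n_0} - a| \le n(n^{B+1} + n^{\alpha})\beta_k + \beta_k)$$
$$= \prod_{1\le i\le n_0-n'} \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|u_i' x_i + \dots + u_{n_0}' x_{n_0} - a| \le (n^{B+2} + n^{\alpha+1} + 1)\beta_k)$$
$$\le \prod_{1\le i\le n_0-n'} \sup_a \mathbf{P}_{x_i,\dots,x_{n_0}}(|u_i x_i + \dots + u_{n_0} x_{n_0} - a| \le (n^{B+2} + n^{\alpha+1} + 1)^2\beta_k)$$
$$= \pi_{\beta_{k+1}}(u). \tag{28}$$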



Naturally, we hope that after the approximation $P_{(n^{B+2}+n^{\alpha+1}+1)\beta_k}(u')$ does not increase much
compared to the original $P_{\beta_k}(u)$. That motivates us to consider a special radius $\beta_{k_0}$ with
respect to $u$ defined below.

Note that the bounded sequence $\pi_{\beta_k}(u)$ increases with $k$, and recall that $\pi_{\beta_0}(u) \ge n^{-Cn}$
for $u \in \mathcal{B}$. Thus, by the pigeonhole principle, there exists $k_0 := k_0(u) \le C\epsilon^{-1}$ such that

$$\pi_{\beta_{k_0+1}}(u) \le n^{\epsilon n}\, \pi_{\beta_{k_0}}(u). \tag{29}$$

It is crucial to note that, since $A$ was chosen to be sufficiently large compared to $O_{B,\epsilon}(1)$
and $C$, we have

$$\beta_{k_0+1} \le n^{-B-4}.$$
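The pigeonhole step behind (29) is easiest to see on a logarithmic scale. In the sketch below (an illustration with hypothetical names), `log_pi[k]` stands for $\log_n \pi_{\beta_k}(u)$, a nondecreasing sequence with values in $[-Cn, 0]$, and `eps_n` stands for $\epsilon n$.

```python
def first_slow_step(log_pi, eps_n):
    """Smallest k with log_pi[k + 1] - log_pi[k] <= eps_n, or None if there is none."""
    for k in range(len(log_pi) - 1):
        if log_pi[k + 1] - log_pi[k] <= eps_n:
            return k
    return None
```

If every one of the first $K$ increments exceeded $\epsilon n$, the sequence would climb by more than $K \epsilon n$, which is impossible once $K \ge C/\epsilon$ because the total range is only $Cn$. Hence a slow step always occurs among the first $C\epsilon^{-1}$ indices.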
Having mentioned the upper bound of $\gamma_\beta^{(i)}(u)$, we now turn to its lower bound. Because of
Condition 1 and $|u_i| \le 1$ for all $i$, and by the pigeonhole principle, the following trivial bound
holds for any $\beta \ge \beta_0$ and $i \le n_0 - n'$:

$$\gamma_\beta^{(i)}(u) \ge \beta n^{-B-2} \ge \beta_0 n^{-B-2} = n^{-A/2+O_{B,\epsilon}(1)}.$$

Subclasses of $u$ in terms of the sequence $(\gamma_{\beta_{k_0}}^{(i)}(u))$. Set

$$I := [n^{-A/2+O_{B,\epsilon}(1)},\ n^{-1/2+\epsilon/2+o(1)}] =: [l_I, r_I].$$

We next divide it into $K = (A/2 + O_{B,\epsilon}(1))\epsilon^{-1}$ sub-intervals $I_k = [l_I n^{k\epsilon}, l_I n^{(k+1)\epsilon}]$. For short,
we denote by $l_k$ the left endpoint of each $I_k$. Thus $l_k = n^{-A/2+O_{B,\epsilon}(1)+k\epsilon}$.
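Locating a value in this multiplicative partition is a one-line computation. The sketch below is illustrative; the parameter name `l_I` for the left endpoint of the whole interval and the function name are assumptions, and floating-point logarithms are used, so values lying exactly on an endpoint may round either way.

```python
import math

def interval_index(gamma, l_I, n, eps):
    """Index k with l_I * n**(k*eps) <= gamma < l_I * n**((k+1)*eps)."""
    return math.floor(math.log(gamma / l_I, n) / eps)
```

Because consecutive endpoints differ by the factor $n^{\epsilon}$, replacing a value in $I_k$ by the left endpoint $l_k$ loses at most that factor.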

With all the necessary settings above, we now classify $u$ based on the distribution of the
$\gamma_{\beta_{k_0}}^{(i)}(u)$, $1 \le i \le n_0 - n^{1-\epsilon}$.

For each $0 \le k_0 \le C\epsilon^{-1}$ and each tuple $(m_0,\dots,m_K)$ satisfying $m_0 + \dots + m_K = n_0 - n'$, we
let $\mathcal{B}_{k_0}^{(m_0,\dots,m_K)}$ denote the collection of those $u$ from $\mathcal{B}_2$ that satisfy the following conditions.

• $k_0(u) = k_0$.

• There are exactly $m_k$ terms of the sequence $(\gamma_{\beta_{k_0}}^{(i)}(u))$ that belong to the interval $I_k$.
In other words, if $m_0 + \dots + m_{k-1} + 1 \le i \le m_0 + \dots + m_k$ then $\gamma_{\beta_{k_0}}^{(i)}(u) \in I_k$.



The approximation. Now we will use Theorem 3.1 to approximate $u \in \mathcal{B}_{k_0}^{(m_0,\dots,m_K)}$ as
follows.

• First step. Consider each index $i$ in the range $1 \le i \le m_0$. Because $\gamma_{\beta_{k_0}}^{(1)} \in I_0$, we
apply Theorem 3.1 to approximate $u_i$ by $u_i'$ such that $|u_i - u_i'| \le \beta_{k_0}$ and the $u_i'$
belong to a GAP $Q_0$ of rank $O_{B,\epsilon}(1)$ and size $O(\gamma^{-1}/\sqrt{n'}) = O(l_0^{-1}/n^{1/2-\epsilon})$ for all
but $n^{1-2\epsilon}$ indices $i$. Furthermore, all $u_i'$ have the form $\beta_{k_0}\cdot(\frac{p}{q} + \sqrt{-1}\frac{p'}{q'})$, where
$|p|, |q|, |p'|, |q'| = O(\beta_{k_0}^{-1}) = O(n^{A/2+O_{B,\epsilon}(1)})$.

• $k$-th step, $1 \le k \le K$. We focus on $i$ from the range $m_0 + \dots + m_{k-1} + 1 \le i \le
m_0 + \dots + m_k$. Because $\gamma_{\beta_{k_0}}^{(m_0+\dots+m_{k-1}+1)} \in I_k$, we apply Theorem 3.1 to approximate
$u_i$ by $u_i'$ such that $|u_i - u_i'| \le \beta_{k_0}$ and the $u_i'$ belong to a GAP $Q_k$ of rank $O_{B,\epsilon}(1)$
and size $O(l_k^{-1}/n^{1/2-\epsilon})$ for all but $n^{1-2\epsilon}$ indices $i$. Furthermore, all $u_i'$ have the form
$\beta_{k_0}\cdot(\frac{p}{q} + \sqrt{-1}\frac{p'}{q'})$, where $|p|, |q|, |p'|, |q'| = O(\beta_{k_0}^{-1}) = O(n^{A/2+O_{B,\epsilon}(1)})$.

• For the remaining components $u_i$, we simply approximate them by the closest
point in $\beta_{k_0}\cdot(\mathbf{Z} + \sqrt{-1}\,\mathbf{Z})$.

We have thus provided an approximation of $u$ by $u'$ satisfying the following properties.

(i) $|u_i - u_i'| \le \beta_{k_0}$ for all $i$.

(ii) $u_i' \in Q_k$ for all but $n^{1-2\epsilon}$ indices $i$ in the range $m_0 + \dots + m_{k-1} + 1 \le i \le m_0 + \dots + m_k$.

(iii) All the $u_i'$, including the generators of $Q_k$, belong to the set $\beta_{k_0}\cdot\{\frac{p}{q} + \sqrt{-1}\frac{p'}{q'} : |p|, |q|, |p'|, |q'| \le
n^{A/2+O_{B,\epsilon}(1)}\}$.

(iv) $Q_k$ has rank $O_{B,\epsilon}(1)$ and size $|Q_k| = O(l_k^{-1}/n^{1/2-\epsilon})$.

Property of $u'$. Let $\mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}$ be the collection of all $u'$ obtained from $u \in \mathcal{B}_{k_0}^{(m_1,\dots,m_K)}$
as above. Observe that, as $|\langle u, r_i(M_n)\rangle| \le \beta_{k_0}$ for all $i \le n - O_{B,\epsilon}(1)$, we have

$$|\langle u', r_i(M_n)\rangle| \le (n^{B+2} + n^{\alpha+1} + 1)\beta_{k_0}. \tag{30}$$



Hence, in order to justify Theorem 2.5 in the case $u \in \mathcal{B}_2$, it suffices to show that the
probability that (30) holds for all $i \le n - O_{B,\epsilon}(1)$, for some $u' \in \mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}$, is small.

Consider a $u' \in \mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}$ and the probability $P_{(n^{B+2}+n^{\alpha+1}+1)\beta_{k_0}}(u')$ that (30) holds for
all $i \le n - O_{B,\epsilon}(1)$. By the discussion in (28), we have

$$P_{(n^{B+2}+n^{\alpha+1}+1)\beta_{k_0}}(u') \le \pi_{(n^{B+2}+n^{\alpha+1}+1)\beta_{k_0}}(u') \le \pi_{\beta_{k_0+1}}(u) \le n^{\epsilon n}\, \pi_{\beta_{k_0}}(u),$$

where in the last inequality we used (29).



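We recall from the definition of $\mathcal{B}_{k_0}^{(m_1,\dots,m_K)}$ that

$$\pi_{\beta_{k_0}}(u) \le \prod_{k=1}^{K} (n^{\epsilon} l_k)^{m_k} = n^{\epsilon(m_1+\dots+m_K)} \prod_{k=1}^{K} l_k^{m_k}.$$

Hence,

$$P_{(n^{B+2}+n^{\alpha+1}+1)\beta_{k_0}}(u') \le n^{2\epsilon n} \prod_{k=1}^{K} l_k^{m_k}. \tag{31}$$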
The size of $\mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}$. In the next step of the argument, we bound the size of $\mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}$.
Because each $Q_k$ is determined by its $O_{B,\epsilon}(1)$ generators from the set $\beta_{k_0}\cdot\{\frac{p}{q} + i\frac{p'}{q'} : |p|, |q|, |p'|, |q'| \le
n^{A/2+O_{B,\epsilon}(1)}\}$, and its dimensions from the integers bounded by $n^{O_{B,\epsilon}(1)}$, there are $n^{O_{A,B,\epsilon}(1)}$
ways to choose each $Q_k$. So the total number of ways to choose $Q_1,\dots,Q_K$ is bounded by

$$(n^{O_{A,B,\epsilon}(1)})^{K} = n^{O_{A,B,\epsilon}(1)}.$$

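Next, after locating the $Q_k$, the number $\mathcal{N}_1$ of ways to choose the $u_i'$ from each $Q_k$ is

$$\mathcal{N}_1 \le \prod_{k=1}^{K} \binom{m_k}{n^{1-2\epsilon}} |Q_k|^{m_k - n^{1-2\epsilon}}$$
$$\le 2^{m_1+\dots+m_K} \prod_{k=1}^{K} |Q_k|^{m_k}$$
$$\le (O(1))^n \prod_{k=1}^{K} l_k^{-m_k} / n^{(1/2-\epsilon)(m_1+\dots+m_K)}$$
$$\le \prod_{k=1}^{K} l_k^{-m_k} / n^{(1/2-\epsilon-o(1))n},$$

where we used the bound $|Q_k| = O(l_k^{-1}/n^{1/2-\epsilon})$ for each $Q_k$.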

The remaining components $u_i'$ can take any value from the set $\beta_{k_0}\cdot\{\frac{p}{q} + i\frac{p'}{q'} : |p|, |q|, |p'|, |q'| \le
n^{A/2+O_{B,\epsilon}(1)}\}$, so the number $\mathcal{N}_2$ of ways to choose them is bounded by



Mi < ( n A+ ° B ^) 2ni+Kn = n ° A < B ^ n ). 
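Putting the bounds for $\mathcal{N}_1$ and $\mathcal{N}_2$ together, we obtain a bound $\mathcal{N}'$ for $|\mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}|$:

$$\mathcal{N}' \le \prod_{k=1}^{K} l_k^{-m_k} / n^{(1/2-\epsilon-o(1))n}. \tag{32}$$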



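Closing the argument. It follows from (31) and (32) that

$$\sum_{u' \in \mathcal{B}_{k_0}'^{(m_1,\dots,m_K)}} P_{(n^{B+2}+n^{\alpha+1}+1)\beta_{k_0}}(u') \le n^{2\epsilon n} \prod_{k=1}^{K} l_k^{m_k} \prod_{k=1}^{K} l_k^{-m_k} / n^{(1/2-\epsilon-o(1))n}$$
$$\le n^{-(1/2-3\epsilon-o(1))n}.$$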
Summing over the choices of $k_0$ and $(m_1,\dots,m_K)$ we obtain the bound



2^ P (n B+2 +n a +1+ i )jSfco (uj < n ^ , 



completing the treatment for incompressible vectors, and hence the proof of Theorem 2.5.



7. Proof of the elliptic law, Theorems 1.5 and 1.7 



This section is devoted to the proof of Theorems 1.5 and 1.7. We introduce the following
notation. Given an $n \times n$ matrix $A_n$, we let $\mu_{A_n}$ denote the empirical measure built from
the eigenvalues of $A_n$ and $\nu_{A_n}$ denote the empirical measure built from the singular values
of $A_n$. That is,

$$\mu_{A_n} := \frac{1}{n}\sum_{i\le n} \delta_{\lambda_i(A_n)}$$

and

$$\nu_{A_n} := \frac{1}{n}\sum_{i\le n} \delta_{\sigma_i(A_n)},$$

where $\lambda_1(A_n),\dots,\lambda_n(A_n)$ are the eigenvalues of $A_n$ and $\sigma_1(A_n) \ge \dots \ge \sigma_n(A_n)$ are the
singular values of $A_n$.



In order to prove Theorems 1.5 and 1.7 we will show that, with probability one,

$$\mu_{\frac{1}{\sqrt{n}}(X_n+F_n)} \to \mu_\rho \tag{33}$$

as $n \to \infty$, where $\mu_\rho$ is the uniform probability measure on the ellipsoid $\mathcal{E}_\rho$. In particular,
(33) implies the almost sure convergence of the ESD of $\frac{1}{\sqrt{n}}(X_n + F_n)$ to the elliptic law with
parameter $\rho$.

To this end, let $\mathcal{P}(\mathbf{C})$ be the set of probability measures on $\mathbf{C}$ which integrate $\log|\cdot|$ in a
neighborhood of infinity. If $\mu \in \mathcal{P}(\mathbf{C})$, we define the logarithmic potential to be the function

$$U_\mu(z) := -\int_{\mathbf{C}} \log|z - \lambda|\, d\mu(\lambda).$$

We will make use of the following uniqueness property [1, Lemma 4.1]: if $\mu, \nu \in \mathcal{P}(\mathbf{C})$ and
$U_\mu(z) = U_\nu(z)$ for a.e. $z \in \mathbf{C}$, then $\mu = \nu$.

We say a Borel function $f$ is uniformly integrable for a sequence of probability measures
$\{\mu_n\}_{n\ge 1}$ if

$$\lim_{t\to\infty} \sup_{n\ge 1} \int_{\{|f|>t\}} |f|\, d\mu_n = 0.$$



For a complex $n \times n$ random matrix $A_n$, there is a connection between the measure $\mu_{A_n}$
and the family of measures $\{\nu_{A_n - zI}\}_{z\in\mathbf{C}}$. In particular,

$$U_{\mu_{A_n}}(z) = -\frac{1}{2n}\log\det\big((A_n - zI)^*(A_n - zI)\big) = -\int_0^\infty \log(s)\, d\nu_{A_n - zI}(s).$$

We refer the reader to the survey [1] for more details. A key tool in the proof of Theorems
1.5 and 1.7 is the following result from [3].



Lemma 7.1 (Hermitization lemma, [3]). Let $\{A_n\}_{n\ge 1}$ be a sequence of complex random
matrices where $A_n$ is of size $n \times n$ for every $n \ge 1$. Suppose that there exists a family of
(non-random) probability measures $\{\nu_z\}_{z\in\mathbf{C}}$ such that for a.a. $z \in \mathbf{C}$, a.s.

(i) $\nu_{A_n - zI} \to \nu_z$ as $n \to \infty$;
(ii) $\log$ is uniformly integrable for $\{\nu_{A_n - zI}\}_{n\ge 1}$.

Then there exists a probability measure $\mu \in \mathcal{P}(\mathbf{C})$ such that

(i) a.s. $\mu_{A_n} \to \mu$ as $n \to \infty$;

(ii) for a.a. $z \in \mathbf{C}$,

$$U_\mu(z) = -\int_0^\infty \log(s)\, d\nu_z(s).$$

Remark 7.2. Since the singular values (and eigenvalues) of $(A_n - zI)^*(A_n - zI)$ are just
$\sigma_1^2(A_n - zI), \sigma_2^2(A_n - zI), \dots, \sigma_n^2(A_n - zI)$, it follows that

$$\nu_{(A_n - zI)^*(A_n - zI)}(-\infty, x) = \nu_{A_n - zI}(-\infty, \sqrt{x})$$

for all $x > 0$. As a consequence, Lemma 7.1 can be equivalently formulated with the family
of measures $\{\nu_{(A_n - zI)^*(A_n - zI)}\}_{z\in\mathbf{C}}$ rather than $\{\nu_{A_n - zI}\}_{z\in\mathbf{C}}$. We will take advantage of this
fact below.
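Both the determinant formula for the logarithmic potential and the remark above can be checked by hand for diagonal matrices, where everything is explicit. The sketch below is an illustration with hypothetical function names: for $D = \mathrm{diag}(d_1,\dots,d_n)$ the eigenvalues are the $d_i$, the singular values of $D - zI$ are $|d_i - z|$, and $(D - zI)^*(D - zI)$ is diagonal with entries $|d_i - z|^2$.

```python
import math

def log_potential_diag(d, z):
    """U_mu(z) = -(1/n) * sum_i log|d_i - z| for the spectral measure of diag(d)."""
    return -sum(math.log(abs(x - z)) for x in d) / len(d)

def log_potential_via_hermitization(d, z):
    """-(1/2n) * log det((D - zI)^*(D - zI)), the product being diag(|d_i - z|^2)."""
    det = math.prod(abs(x - z) ** 2 for x in d)
    return -math.log(det) / (2 * len(d))

def cdf_count(values, x):
    """Fraction of the given values lying strictly below x."""
    return sum(1 for v in values if v < x) / len(values)
```

The two potentials agree up to rounding, and the distribution function of the squared singular values at $x$ matches that of the singular values at $\sqrt{x}$.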



In conjunction with Remark 7.2, we define the matrix 

1 



X n - zl 



1 



At, 



Zl). 



n 



For our purposes, we will need to show that the limiting measure $\mu \in \mathcal{P}(\mathbf{C})$ in Lemma 7.1 is
given by $\mu_\rho$. Fix $-1 < \rho < 1$. We say the family of measures $\{\nu_z\}_{z\in\mathbf{C}}$ determines the elliptic
law with parameter $\rho$ by Lemma 7.1 if

$$U_{\mu_\rho}(z) = -\int_0^\infty \log(s)\, d\nu_z(s)$$

for all $z \in \mathbf{C}$. The existence of this family of measures was verified and used in [29].



The key tool we use to prove Theorems 1.5 and 1.7 is the following comparison lemma.

Lemma 7.3. Let $0 \le \mu \le 1$ and $-1 < \rho < 1$ be given. Let $\{X_n\}_{n\ge 1}$ and $\{Y_n\}_{n\ge 1}$ be
sequences of random matrices that satisfy condition C0 with atom variables $(\xi_1, \xi_2)$ and
$(\eta_1, \eta_2)$, respectively. Assume $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ are from the $(\mu, \rho)$-family. Assume for
a.a. $z \in \mathbf{C}$ that a.s.

$$\nu_{\frac{1}{\sqrt{n}}Y_n - zI} \to \nu_z$$

as $n \to \infty$ for a family of deterministic measures $\{\nu_z\}_{z\in\mathbf{C}}$. Assume $\{F_n\}_{n\ge 1}$ is a sequence
of deterministic matrices such that $\mathrm{rank}(F_n) = o(n)$ and $\sup_n \frac{1}{n^2}\|F_n\|_2^2 < \infty$. Then a.s.

$$\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI} \to \nu_z$$

as $n \to \infty$.

Lemma 7.3 is useful when we know the limit of $\nu_{\frac{1}{\sqrt{n}}Y_n - zI}$. For our purposes, we will take
$\{Y_n\}_{n\ge 1}$ to be a sequence of matrices that satisfy condition C0 with jointly Gaussian
entries. In the real case, the limiting ESD of $\frac{1}{\sqrt{n}}Y_n$ was computed in [29].



We divide the proof of Theorems 1.5 and 1.7 into a number of lemmas organized below by
sub-section.

(1) In order to apply Lemma 7.1, we need to show that $\log$ is uniformly integrable for
$\{\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI}\}_{n\ge 1}$. We prove this statement in sub-section 7.4. The arguments
in this section are based on [1], [29], [42]. We will also require the use of Theorem 2.1
to control the least singular value.

(2) In sub-section 7.12 we prove a replacement lemma using a moment matching argument.
The lemma will allow us to compute the limit of $\nu_{\frac{1}{\sqrt{n}}X_n - zI}$ by comparing the
Stieltjes transform of this measure to the corresponding Stieltjes transform in the
Gaussian case. In order to prove this lemma, we will first need to bound the variance
of the resolvent (sub-section 7.8) and apply a truncation argument (sub-section
7.10).

(3) In sub-section 7.15 we prove Lemma 7.3. We then apply the results of [29] and
Lemma 7.3 to prove Theorem 1.5.

(4) In sub-section 7.16 we prove Theorem 1.7.



7.4. Uniform Integrability. In this sub-section, we prove the following lemma.

Lemma 7.5. Let $0 \le \mu \le 1$ and $-1 < \rho < 1$ be given. Let $\{X_n\}_{n\ge 1}$ be a sequence of
random matrices that satisfies condition C0 with atom variables $(\xi_1, \xi_2)$ from the $(\mu, \rho)$-family.
Assume $\{F_n\}_{n\ge 1}$ is a sequence of deterministic matrices such that $\mathrm{rank}(F_n) =
o(n)$ and $\sup_n \frac{1}{n^2}\|F_n\|_2^2 < \infty$. Then for a.a. $z \in \mathbf{C}$ a.s. $\log$ is uniformly integrable for
$\{\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI}\}_{n\ge 1}$.

The proof of Lemma 7.5 is based on the arguments of [1], [29], [42]. In order to prove Lemma
7.5, we will need the following bound for small singular values.



Lemma 7.6. There exist $c_0 > 0$ and $0 < \gamma < 1$ such that the following holds. Let $\{X_n\}_{n\ge 1}$
be a sequence of random matrices that satisfies condition C0. Then a.s. for $n \gg 1$ and for
all $n^{1-\gamma} \le i \le n - 1$ and all deterministic $n \times n$ matrices $M$,

$$\sigma_{n-i}(n^{-1/2}X_n + M) \ge c_0 \frac{i}{n}.$$

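Proof. Let $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n$ denote the singular values of $A = \frac{1}{\sqrt{n}}X_n + M$. It suffices to
prove the lemma for $2n^{1-\gamma} \le i \le n - 1$ for some $0 < \gamma < 1$ to be chosen later. Let $A'$ be
the matrix formed from the first $m = \lceil n - i/2\rceil$ rows of $\sqrt{n}A$. Let $\sigma_1' \ge \dots \ge \sigma_m'$ denote
the singular values of $A'$. From eigenvalue interlacing it follows that

$$\frac{1}{\sqrt{n}}\sigma_{n-i}' \le \sigma_{n-i}.$$

By [12, Lemma A.4],

$$\sigma_1'^{-2} + \dots + \sigma_m'^{-2} = \mathrm{dist}_1^{-2} + \dots + \mathrm{dist}_m^{-2},$$

where $\mathrm{dist}_j = \mathrm{dist}(r_j, H_j)$, $r_j$ is the $j$-th row of $A'$, and

$$H_j = \mathrm{Span}\{r_k : k = 1,\dots,m;\ k \ne j\}.$$

Since $\sigma_j'^{-2} \ge \sigma_{n-i}'^{-2} \ge \frac{1}{n}\sigma_{n-i}^{-2}$ for $j \ge n - i$, it follows that

$$\frac{i}{2n}\sigma_{n-i}^{-2} \le \frac{i}{2}\sigma_{n-i}'^{-2} \le \sum_{j=n-i}^{m} \sigma_j'^{-2} \le \sum_{j=1}^{m} \mathrm{dist}_j^{-2}. \tag{34}$$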



We now wish to estimate $\mathrm{dist}(r_j, H_j)$. However, $r_j$ and $H_j$ are not independent. To work
around this problem, we define the matrix $A_j'$ to be the matrix $A'$ with the $j$-th column
removed. Let $r_j'$ be the $j$-th row of $A_j'$ and let

$$H_j' = \mathrm{Span}\{r_k(A_j') : k = 1,\dots,m;\ k \ne j\}.$$

Note that $r_j'$ and $H_j'$ are independent for each $j = 1,\dots,m$.

We also have

$$\mathrm{dist}(r_j, H_j) = \inf_{v\in H_j}\|r_j - v\| \ge \inf_{v\in H_j'}\|r_j' - v\| = \mathrm{dist}(r_j', H_j'),$$



where 



dim(Fj) < dim(iT,) <n-l--<n-l-(n-l 



CO. 



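By Lemma 7.7 below and the union bound, we obtain

$$\sum_{n=1}^{\infty} \mathbf{P}\Big(\bigcup_{i=2n^{1-\gamma}}^{n-1}\ \bigcup_{j=1}^{m} \{\mathrm{dist}_j \le c_0\sqrt{i}\}\Big) < \infty.$$

Thus, by the Borel-Cantelli lemma, for all $2n^{1-\gamma} \le i \le n - 1$ and all $1 \le j \le m$,

$$\mathrm{dist}_j \ge c_0\sqrt{i} \quad \text{a.s.}$$

The proof of Lemma 7.6 is then complete by the above estimate and (34). □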



Lemma 7.7 (Distance of a random vector to a subspace). Let $x$ and $y$ be complex-valued
random variables with unit variance. Then there exist $\gamma > 0$ and $\epsilon > 0$ such that the
following holds. Let $(\xi_1, \xi_2, \dots, \xi_n)$ be a random vector in $\mathbf{C}^n$ with independent entries.
Assume further that for each $1 \le i \le n$, $\xi_i$ is equal in distribution to either $x$ or $y$.
Then for all $n \gg 1$, any deterministic vector $v \in \mathbf{C}^n$ and any subspace $H$ of $\mathbf{C}^n$ with
$1 \le \dim(H) \le n - n^{1-\gamma}$, we have

$$\mathbf{P}\Big(\mathrm{dist}(R, H) \le \frac{1}{2}\sqrt{n - \dim(H)}\Big) \le \exp(-n^{\epsilon}),$$

where $R = (\xi_1, \xi_2, \dots, \xi_n) + v$.

Proof. Let $H'$ be the subspace spanned by $H$, $v$, and $\mathbf{E}[R]$. Then $\dim(H') \le \dim(H) + 2$
and $\mathrm{dist}(R, H) \ge \mathrm{dist}(R, H') = \mathrm{dist}(R', H')$ where $R' = R - \mathbf{E}[R]$. Thus it suffices to prove
the lemma when $v = 0$ and $\mathbf{E}[x] = \mathbf{E}[y] = 0$.

We now perform a truncation. By Chebyshev's inequality,

$$\mathbf{P}(|\xi_i| \ge n^{\epsilon}) \le n^{-2\epsilon}.$$

Furthermore, by Hoeffding's inequality

$$\mathbf{P}\Big(\sum_{i=1}^{n} \mathbf{1}_{\{|\xi_i| \le n^{\epsilon}\}} < n - n^{1-\epsilon}\Big) \le \exp(-n^{1-2\epsilon}),$$

where we take $\epsilon \in (0, 1/3)$. Therefore we will prove the lemma by conditioning on the event

$$\Omega_m = \{|\xi_1| \le n^{\epsilon}, \dots, |\xi_m| \le n^{\epsilon}\}$$

with $m = \lceil n - n^{1-\epsilon}\rceil$.

We now deal with the fact that on the event O m , the random vector (£i, . . . , Cm) may have 
non-zero mean. Let E m denote the conditional expectation with respect to the event fi m 
and the cr-algebra F m = cr(£ TO+ i, . . . , £ n ). Let W be the subspace spanned by H, u, and w 
where 

u = (0,...,0,£ m+ i,...,£ n ), u> = (E m [£i],...,E m [£ OT ],0,...,0). 
Clearly is J-* m -measurable. Moreover, dim(M /r ) < dim(H) + 2. Define 

y = (ei-E4£ 1 ],...,£ m -E m [6 n ],0,...,0) = i?-u-™. 

Then dist(i?, -ff) > dist(i?, W) = dist(Y, W). By construction each entry of Y has mean 
zero. Since each entry of the original vector R is equal in distribution to either x or y, it 
follows that 

sup \af — 1| = 1 — o(l) 

l<i<m 

where a 2 = E m |Y;| 2 . 

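By Talagrand's concentration inequality [37],

\[ \mathbb{P}_m\left( |\mathrm{dist}(Y, W) - M_m| \geq t \right) \leq 4 \exp\left( -\frac{t^2}{16 n^{2\varepsilon}} \right) \tag{35} \]

where $M_m$ is the median of $\mathrm{dist}(Y, W)$ under $\Omega_m$. Using (35) one can verify that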



M m > \J E m dist 2 (y, W) - Cn A£ 

for some positive constant $C$ (see for instance [45, Lemma E.3]). Let $P$ denote the orthogonal
projection onto $W^{\perp}$. Then

\[ \mathbb{E}_m \mathrm{dist}^2(Y, W) = \sum_{k=1}^{m} \mathbb{E}_m[|Y_k|^2] P_{kk} \geq c \left( \sum_{k=1}^{n} P_{kk} - \sum_{k=m+1}^{n} P_{kk} \right) \geq c\left( n - \dim(H) - (n - m) \right) \]

for any $1/2 < c < 1$ and $n \gg_c 1$. Thus

\[ M_m \geq c \sqrt{n - \dim(H)} \]

for $n$ sufficiently large. Finally, we choose $0 < \gamma < \varepsilon/2$ and the proof of the lemma is
complete by taking $t = (c - 1/2)\sqrt{n - \dim(H)}$ in (35). □






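We now prove Lemma 7.5.

Proof of Lemma 7.5. By Markov's inequality, it suffices to show that there exists $p > 0$
such that for a.a. $z \in \mathbb{C}$ a.s.

\[ \limsup_{n \to \infty} \int s^{p} \, d\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI}(s) < \infty \quad \text{and} \quad \limsup_{n \to \infty} \int s^{-p} \, d\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI}(s) < \infty. \]

Fix $z \in \mathbb{C}$. Then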



s p disj_ x ,j_ F j < 1 + -tr \^=X n + — 



n \ n 



for p < 2. We expand out the right-hand side and consider three separate terms. First, by 
the law of large numbers, 



1 



-tr 



n 



1 



X n - zl 



n 



1 



n x n - zi j < i + 1 £ ni 2 - 2Re ( ^2 E *** ) + i*i 

i,i=l \ fc=l / 



— ► 2 + 

a.s. as n — > oo. Here, we first divide the sums into three parts in order to apply the law 
of large numbers. The first when i < j, the second when i > j, and the third when i = j. 
In this way the summands in each sum are i.i.d. random variables and the law of large 
numbers applies. 

Second, 



-tr 



n 



1 + 

n \/n 



Fn-Zl 



l + M 2 ,, „2 



1 



n- 



tr(F*X n ) 



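Since $\sup_n \frac{1}{n^2}\|F_n\|_2^2 < \infty$ by assumption, it suffices to show that $\limsup_{n \to \infty} \frac{1}{n^2} |\mathrm{tr}(F_n^* X_n)| < \infty$ a.s. We apply the bounds

\[ \frac{1}{n^2} |\mathrm{tr}(F_n^* X_n)| \leq \frac{1}{n^2} \|F_n\|_2 \|X_n\|_2 \leq \frac{1}{n^2} \|F_n\|_2^2 + \frac{1}{n^2} \|X_n\|_2^2. \]

By the law of large numbers (again considering three separate terms), it follows that a.s.

\[ \limsup_{n \to \infty} \frac{1}{n^2} \mathrm{tr}(X_n^* X_n) < \infty. \]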



Similarly, for the third term, we have that a.s 

1 



lim sup 

n— »oo 



-tr 



71 



L X n + ^=F n -zI 
n vn 



F r , 



71 



< OO. 



We simplify our notation for the remainder of the proof and write $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n$ for
the singular values of $\frac{1}{\sqrt{n}}(X_n + F_n) - zI$. By Theorem 2.1 we have that for some $A > 0$,

\[ \sigma_n \geq n^{-A} \quad \text{a.s.} \]
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Thus, 



i=l 



_. n 1 n—n 7 n 

-E^<- E ^ rP +- E ^ 7P 

i=l i=r. 
1 - 

-E 



c 



i=l 



?' 



+ n 



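a.s. by Lemma 7.6. The remaining sum is just the Riemann sum of the integral $\int_0^1 u^{-p} \, du$.
Therefore, we have that

\[ \limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \sigma_i^{-p} < \infty \quad \text{a.s.} \]

for $p < \min\{1, \gamma/A\}$.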



□ 
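The role of the restriction $p < 1$ in the Riemann-sum step can be seen numerically. The following is a minimal sketch, not part of the proof: it evaluates the right-endpoint Riemann sum $\frac{1}{n}\sum_{i=1}^n (i/n)^{-p}$ for a sample exponent $p = 1/2$ (an arbitrary choice made here) and compares it with $\int_0^1 u^{-p}\,du = 1/(1-p)$; for $p = 3/2$ the same sum grows without bound.

```python
def riemann_sum(n, p):
    """Right-endpoint Riemann sum of u^{-p} on (0, 1] with n cells."""
    return sum((i / n) ** (-p) for i in range(1, n + 1)) / n

p = 0.5                                   # sample exponent with p < 1
limit = 1 / (1 - p)                       # value of the integral of u^{-p}
values = [riemann_sum(n, p) for n in (10, 100, 1000, 10000)]

# For p >= 1 the integral diverges and so does the Riemann sum.
divergent = riemann_sum(10000, 1.5)
```

Since $u^{-p}$ is decreasing, the right-endpoint sums stay below the integral for every $n$, which is the boundedness the proof uses.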



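7.8. Variance Bound. In this sub-section, we prove the following lemma.

Lemma 7.9. There exists a positive constant $C$ such that the following holds. Let $\{X_n\}_{n \geq 1}$
be a sequence of random matrices that satisfies condition C0 with atom variables $(\xi_1, \xi_2)$.
Define

\[ R_n := \frac{1}{\sqrt{n}} X_n - zI, \qquad H_n(\alpha) := (R_n^* R_n - \alpha I)^{-1} \]

where $\alpha \in \mathbb{C}$ with $\mathrm{Im}(\alpha) \neq 0$. Then

\[ \mathbb{E}\left| \frac{1}{n} \mathrm{tr}\, H_n(\alpha) - \mathbb{E} \frac{1}{n} \mathrm{tr}\, H_n(\alpha) \right|^4 \leq C \frac{c_\alpha^4}{n^2} \tag{36} \]

uniformly for $z \in \mathbb{C}$ where

\[ c_\alpha := \frac{1}{|\mathrm{Im}(\alpha)|} + \frac{|\alpha|}{|\mathrm{Im}(\alpha)|^2}. \]

Moreover, for every fixed $\alpha$,

\[ \frac{1}{n} \mathrm{tr}\, H_n(\alpha) = \mathbb{E} \frac{1}{n} \mathrm{tr}\, H_n(\alpha) + O(n^{-1/8}) \quad \text{a.s.} \tag{37} \]

uniformly for $z \in \mathbb{C}$.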



Proof. Let $\mathbb{E}_{\leq k}$ denote conditional expectation with respect to the σ-algebra generated by
$r_1(X_n), \ldots, r_k(X_n), c_1(X_n), \ldots, c_k(X_n)$. Define

\[ Y_k := \mathbb{E}_{\leq k} \frac{1}{n} \mathrm{tr}\, H_n(\alpha) \]

for $k = 0, 1, \ldots, n$. Clearly $\{Y_k\}_{k=0}^{n}$ is a martingale. Define the martingale difference
sequence

\[ \alpha_k := Y_k - Y_{k-1} \]

for $k = 1, 2, \ldots, n$. Then by construction

\[ \sum_{k=1}^{n} \alpha_k = \frac{1}{n} \mathrm{tr}\, H_n(\alpha) - \mathbb{E} \frac{1}{n} \mathrm{tr}\, H_n(\alpha). \]
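We will bound the fourth moment of the sum, but first we obtain a bound on the individual
summands. Let $X_{n,k}$ denote the matrix $X_n$ with the $k$-th row and $k$-th column replaced by
zeros. Let

\[ R_{n,k} := \frac{1}{\sqrt{n}} X_{n,k} - zI, \qquad H_{n,k}(\alpha) := (R_{n,k}^* R_{n,k} - \alpha I)^{-1}. \]

It follows that

\[ \mathbb{E}_{\leq k} \frac{1}{n} \mathrm{tr}\, H_{n,k}(\alpha) = \mathbb{E}_{\leq k-1} \frac{1}{n} \mathrm{tr}\, H_{n,k}(\alpha) \]

and hence

\[ \alpha_k = \mathbb{E}_{\leq k}\left[ \frac{1}{n} \mathrm{tr}\, H_n(\alpha) - \frac{1}{n} \mathrm{tr}\, H_{n,k}(\alpha) \right] - \mathbb{E}_{\leq k-1}\left[ \frac{1}{n} \mathrm{tr}\, H_n(\alpha) - \frac{1}{n} \mathrm{tr}\, H_{n,k}(\alpha) \right]. \]

By the resolvent identity,

\[ |\mathrm{tr}\, H_n(\alpha) - \mathrm{tr}\, H_{n,k}(\alpha)| = \left| \mathrm{tr}\left[ H_n (R_n^* R_n - R_{n,k}^* R_{n,k}) H_{n,k} \right] \right|. \]

Since $R_n^* R_n - R_{n,k}^* R_{n,k}$ is at most rank 4, it follows that

\[ |\mathrm{tr}\, H_n(\alpha) - \mathrm{tr}\, H_{n,k}(\alpha)| \leq 4 \left\| H_n (R_n^* R_n - R_{n,k}^* R_{n,k}) H_{n,k} \right\|. \]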
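We then note that

\[ \|H_n(\alpha) R_n^* R_n\| \leq \sup_{t \geq 0} \left| \frac{t}{t - \alpha} \right| \leq 1 + \sup_{t \geq 0} \frac{|\alpha|}{|t - \alpha|} \leq 1 + \frac{|\alpha|}{|\mathrm{Im}(\alpha)|} \]

since the eigenvalues of $R_n^* R_n$ are non-negative. Similarly,

\[ \|H_{n,k}(\alpha) R_{n,k}^* R_{n,k}\| \leq 1 + \frac{|\alpha|}{|\mathrm{Im}(\alpha)|}. \]

Since we always have the bound $\|H_n(\alpha)\| \leq |\mathrm{Im}(\alpha)|^{-1}$, it follows that

\[ |\mathrm{tr}\, H_n(\alpha) - \mathrm{tr}\, H_{n,k}(\alpha)| \leq 8 c_\alpha. \]

Thus we conclude that

\[ |\alpha_k| \leq \frac{16 c_\alpha}{n}. \]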
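The two scalar bounds behind these operator-norm estimates, $\sup_{t \geq 0} |t/(t-\alpha)| \leq 1 + |\alpha|/|\mathrm{Im}(\alpha)|$ and $\sup_{t} 1/|t-\alpha| \leq 1/|\mathrm{Im}(\alpha)|$, can be spot-checked on a grid. This is only a numerical illustration: the value of `alpha` and the grid of $t$ values are arbitrary choices made here, and a grid check is of course not a proof.

```python
def ratio(t, alpha):
    """|t / (t - alpha)| for real t >= 0 and complex alpha off the real axis."""
    return abs(t / (t - alpha))

alpha = complex(0.7, 0.2)                      # sample spectral parameter
bound = 1 + abs(alpha) / abs(alpha.imag)       # claimed bound for t/(t - alpha)
grid = [k / 100 for k in range(0, 100001)]     # t ranging over [0, 1000]

worst_ratio = max(ratio(t, alpha) for t in grid)
worst_resolvent = max(1 / abs(t - alpha) for t in grid)
```

The eigenvalues of $R_n^* R_n$ play the role of $t$, which is why only $t \geq 0$ needs to be considered for the first bound.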



By the Burkholder inequality (see [3, Lemma 2.12] for a complex martingale version of the
Burkholder inequality), there exists an absolute constant $C > 0$ such that

\[ \mathbb{E}\left| \sum_{k=1}^{n} \alpha_k \right|^4 \leq C\, \mathbb{E}\left( \sum_{k=1}^{n} |\alpha_k|^2 \right)^2 \leq 16^4 C \frac{c_\alpha^4}{n^2}. \]

The proof of (36) is complete.



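To prove (37), we use Markov's inequality and (36) to obtain

\[ \mathbb{P}\left( \left| \frac{1}{n} \mathrm{tr}\, H_n(\alpha) - \mathbb{E} \frac{1}{n} \mathrm{tr}\, H_n(\alpha) \right| \geq \varepsilon \right) \leq C \frac{c_\alpha^4}{n^2 \varepsilon^4}. \]

The result follows by taking $\varepsilon = n^{-1/8}$ and applying the Borel-Cantelli lemma.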



□ 
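The choice $\varepsilon = n^{-1/8}$ works because the Markov bound $C c_\alpha^4 / (n^2 \varepsilon^4)$ then equals $C c_\alpha^4 n^{-3/2}$, which is summable in $n$, as Borel-Cantelli requires. The sketch below illustrates this with the placeholder values $C = c_\alpha = 1$ (assumptions made here for illustration only).

```python
def tail_bound(n, C=1.0, c_alpha=1.0):
    """Markov bound C * c_alpha^4 / (n^2 * eps^4) with eps = n^(-1/8)."""
    eps = n ** (-1 / 8)
    return C * c_alpha ** 4 / (n ** 2 * eps ** 4)

# Partial sums of the bound stay below zeta(3/2), which is about 2.612.
partial = sum(tail_bound(n) for n in range(1, 200001))
```

A smaller exponent in $\varepsilon$ would still give a summable bound, so $1/8$ is one convenient choice rather than the only possible one.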
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7.10. Truncation. Given a sequence of random matrices {X n } n >i that satisfies condition 
CO, we define the sequences {X n } n >i and {X n } n >i where for each n > 1, X n = {xij)i<i,j<n 
and X n = (xij)i<ij< n with 



Xij 



x 



i i 1 {|a:<j|<n a } ~ ^ , [ x ij 1 {\x i j\<n l >}\-> « ^ J 
0, i=j 



and 



2 

0, i = j 

for some <5 > 0, which we will choose later. For each n > 1, define the matrices 



and 

11, - ( -^X n - zl) (^=X n -zl) . 
in I \ \ n 



We let $L(\mu, \nu)$ denote the Levy distance between the probability measures $\mu$ and $\nu$. We
prove the following truncation lemma.

Lemma 7.11. Let {X n } n >i be a sequence of random matrices that satisfies condition CO. 
Then uniformly for any \z\ < M, we have that 

Moreover, 

E[Re(x ij ) k Im(x ij ) l Re(x ji ) m lm{x ji ) p } = E[Re{xi j ) k lm(xi j ) l Re(x ji ) m lm{x ji ) p ] + o(l) (38) 
uniformly for i ^ j and all non-negative integers k, I, m,p such that k + l + m + p< 2. 

Proof By [3, Corollary A.42], 

L \»H n , »hJ < §s [ tr (#n + H n )tr ((X n - X n f {X n - XrS)] ■ (39) 

By the law of large numbers, 



1 



1 z \ 

trH " = ^E ni 2 - 2Re Y. Xkk + \ z \ 



n n* * — ' ' "' \ n 3 / 2 

«J = 1 \ k=l / 

a.s. as n — > oo. Here, we first divide the sums into three parts in order to apply the law of 
large numbers. The first when i < j, the second when i > j, and the third when i = j. In 
this way the summands in each sum are i.i.d. and the law of large numbers applies. 

Similarly, 

-trH n — > 1 + \z\ 2 
n 

a.s. as n — > oo. 
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For the remaining terms, we note that 



1 



n- 



tr ((X n - X n )*(X n - X n )"j <^2 J2 [ 



l<i<j'<n 

2 



+ 



n- 



E [i 



— V 

n 2 ^ 



l<i<i<« 
n 



i=l 



By the law of large numbers, each sum on the right-hand side converges to zero a.s. as 

(40) 



n — > oo. Combining these estimates into (39), yields 



L (yH n ,Vjj n ) = o(l) 



a.s. as n — > oo. 

Again using [3, Corollary A.42],



L ^h„^hJ ^ Z3 tr (^« + An)tr ( (X n - X n f(X n - X n 



??•' 



It then follows that 



a.s. since 



(41) 



I- ^|^x ij \' 1 = o{l) 



uniformly for all $i \neq j$ by the identical distribution assumption of condition C0. The result
then follows from estimates (40) and (41).

(38) can be obtained from the dominated convergence theorem; the identical distribution
portion of condition C0 gives uniform control for all $i \neq j$. □



7.12. Replacement. In this sub-section, we prove a comparison lemma based on moment 
matching. We begin with a definition. 

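Definition 7.13 (Moment matching). Let $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ be two random vectors in
$\mathbb{C}^2$. We say that $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ match to order $k$ if

\[ \mathbb{E}\left[ \mathrm{Re}(\xi_1)^i\, \mathrm{Im}(\xi_1)^j\, \mathrm{Re}(\xi_2)^l\, \mathrm{Im}(\xi_2)^m \right] = \mathbb{E}\left[ \mathrm{Re}(\eta_1)^i\, \mathrm{Im}(\eta_1)^j\, \mathrm{Re}(\eta_2)^l\, \mathrm{Im}(\eta_2)^m \right] \]

for all non-negative integers $i, j, l, m$ with $i + j + l + m \leq k$.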
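For finitely supported pairs, the moment-matching condition can be verified exactly by enumerating all exponent tuples. The sketch below is an illustration only: the two distributions are hypothetical examples chosen here (a Rademacher pair with $\xi_1 = \xi_2$, and a sparse pair taking the values $\pm 2$ with probability $1/8$ each and $0$ otherwise), and the function names are not from the paper.

```python
from fractions import Fraction as F
from itertools import product

def mixed_moment(dist, i, j, l, m):
    """E[Re(x1)^i Im(x1)^j Re(x2)^l Im(x2)^m] for a finite distribution.

    dist is a list of (probability, (re1, im1, re2, im2)) entries.
    """
    return sum(p * re1 ** i * im1 ** j * re2 ** l * im2 ** m
               for p, (re1, im1, re2, im2) in dist)

def match_to_order(d1, d2, k):
    """True if all mixed moments with i + j + l + m <= k agree exactly."""
    return all(mixed_moment(d1, *e) == mixed_moment(d2, *e)
               for e in product(range(k + 1), repeat=4) if sum(e) <= k)

# Hypothetical real-valued pairs with xi_1 = xi_2, mean zero, unit variance.
rademacher = [(F(1, 2), (F(1), F(0), F(1), F(0))),
              (F(1, 2), (F(-1), F(0), F(-1), F(0)))]
sparse = [(F(1, 8), (F(2), F(0), F(2), F(0))),
          (F(1, 8), (F(-2), F(0), F(-2), F(0))),
          (F(3, 4), (F(0), F(0), F(0), F(0)))]

order2 = match_to_order(rademacher, sparse, 2)
order4 = match_to_order(rademacher, sparse, 4)
```

The two pairs agree up to order 2 (indeed up to order 3 by symmetry) but differ in the fourth moment, which is 1 for the first pair and 4 for the second.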



The goal of this sub-section is to prove the following lemma. 

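Lemma 7.14. Let $\{X_n\}_{n \geq 1}$ and $\{Y_n\}_{n \geq 1}$ be sequences of random matrices that satisfy
condition C0 with atom variables $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$, respectively. Assume the moments
of $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ match to order 2. Then for a.a. $z \in \mathbb{C}$ a.s.

\[ \nu_{\frac{1}{\sqrt{n}} X_n - zI} - \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow 0 \]

as $n \to \infty$.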
We proceed using the Stieltjes transform. For an $n \times n$ matrix $A$, we define the matrices

\[ R_n(A) = \frac{1}{\sqrt{n}} A - zI, \qquad G_n(A) = (R_n(A)^* R_n(A) - \alpha I)^{-1} \]

where $z, \alpha \in \mathbb{C}$ with $\mathrm{Im}(\alpha) > 0$. Using the resolvent identity, we can compute

\[ \frac{\partial (G_n(A))_{ij}}{\partial \mathrm{Re}(A_{st})} = -\frac{1}{\sqrt{n}} \left[ (G_n(A) R_n(A)^*)_{is} (G_n(A))_{tj} + (G_n(A))_{it} (R_n(A) G_n(A))_{sj} \right] \tag{42} \]

and

\[ \frac{\partial (G_n(A))_{ij}}{\partial \mathrm{Im}(A_{st})} = -\frac{\sqrt{-1}}{\sqrt{n}} \left[ (G_n(A) R_n(A)^*)_{is} (G_n(A))_{tj} - (G_n(A))_{it} (R_n(A) G_n(A))_{sj} \right]. \tag{43} \]



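Fix the indices $a \neq b$. Let $V_1 = e_a e_b^*$ and $V_2 = e_b e_a^*$ where $e_1, \ldots, e_n$ is the standard basis
in $\mathbb{C}^n$. Let $x_1, x_2, x_3, x_4$ be real variables. We define the function

\[ f(x_1, x_2, x_3, x_4) = \frac{1}{n} \mathrm{tr}\, G_n\left( A + x_1 V_1 + \sqrt{-1}\, x_2 V_1 + x_3 V_2 + \sqrt{-1}\, x_4 V_2 \right). \]

Using the derivatives above, we write out the power series

\[ f(x_1, x_2, x_3, x_4) = f(0,0,0,0) + \sum_{k=1}^{4} \frac{\partial f}{\partial x_k}(0,0,0,0)\, x_k + \frac{1}{2} \sum_{i,j=1}^{4} \frac{\partial^2 f}{\partial x_i \partial x_j}(0,0,0,0)\, x_i x_j + \epsilon \tag{44} \]

where $|\epsilon| \leq C M (|x_1|^3 + |x_2|^3 + |x_3|^3 + |x_4|^3)$ with $M$ defined by

\[ M = \sup_{1 \leq i,j,k \leq 4}\ \sup_{x_1, x_2, x_3, x_4} \left| \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}(x_1, x_2, x_3, x_4) \right|. \]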

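We now obtain a bound for $M$ and the partial derivatives of $f$. Note that the bounds we
derive below hold uniformly for any matrix $A$. We can write $R_n(A) = U \sqrt{R_n(A)^* R_n(A)}$
where $U$ is a partial isometry. So

\[ \|R_n(A) G_n(A)\| \leq \left\| U \sqrt{R_n(A)^* R_n(A)}\, (R_n(A)^* R_n(A) - \alpha I)^{-1} \right\| \leq \left\| \sqrt{R_n(A)^* R_n(A)}\, (R_n(A)^* R_n(A) - \alpha I)^{-1} \right\| \]
\[ \leq \sup_{t \geq 0} \left| \frac{\sqrt{t}}{t - \alpha} \right| \leq \frac{1}{|\mathrm{Im}(\alpha)|} + \sup_{t \geq 0} \left| \frac{t}{t - \alpha} \right| \leq 1 + \frac{|\alpha| + 1}{|\mathrm{Im}(\alpha)|} \]

and similarly

\[ \|G_n(A) R_n(A)^*\| \leq 1 + \frac{|\alpha| + 1}{|\mathrm{Im}(\alpha)|}. \]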



Thus, by (42), (43), and the bounds above, it follows that 

d 2 f 



dx k \n 



On 



dxkdxi \ 11 

uniformly for 1 < k < 4, any xi,x 2 , x%, 24 G M, and any A. 



93 f = p a ( 1 

dxkdxidxj \n 



(45) 
We are now ready to prove Lemma 7.14. By Lemma 7.11 and Remark 7.2, it suffices to
show that a.s.

\[ \nu_{R_n(X_n)^* R_n(X_n)} - \nu_{R_n(Y_n)^* R_n(Y_n)} \longrightarrow 0 \tag{46} \]

as $n \to \infty$. So, without loss of generality, we assume $\xi_1, \xi_2, \eta_1, \eta_2$ have mean zero, unit
variance, and are bounded almost surely in magnitude by $n^{\delta}$ for some $0 < \delta < 1/2$.

By [3, Theorem B.9], we can equivalently state (46) as

\[ \frac{1}{n} \mathrm{tr}\, G_n(X_n) - \frac{1}{n} \mathrm{tr}\, G_n(Y_n) \longrightarrow 0 \]

a.s. for each fixed $\alpha$ with $\mathrm{Im}(\alpha) > 0$. However, by Lemma 7.9 this reduces to showing that

\[ \mathbb{E} \frac{1}{n} \mathrm{tr}\, G_n(X_n) - \mathbb{E} \frac{1}{n} \mathrm{tr}\, G_n(Y_n) \longrightarrow 0. \tag{47} \]



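We will verify (47) by showing that for each fixed $\alpha$ with $\mathrm{Im}(\alpha) > 0$,

\[ \mathbb{E} f\left( \frac{\mathrm{Re}(\xi_1)}{\sqrt{n}}, \frac{\mathrm{Im}(\xi_1)}{\sqrt{n}}, \frac{\mathrm{Re}(\xi_2)}{\sqrt{n}}, \frac{\mathrm{Im}(\xi_2)}{\sqrt{n}} \right) = \mathbb{E} f\left( \frac{\mathrm{Re}(\eta_1)}{\sqrt{n}}, \frac{\mathrm{Im}(\eta_1)}{\sqrt{n}}, \frac{\mathrm{Re}(\eta_2)}{\sqrt{n}}, \frac{\mathrm{Im}(\eta_2)}{\sqrt{n}} \right) + o_\alpha(n^{-2}) \tag{48} \]

where we take $A$ to be any matrix independent of $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$. Indeed, by allowing
$a$ and $b$ to range over all $O(n^2)$ indices, and by the triangle inequality, we obtain

\[ \mathbb{E} \frac{1}{n} \mathrm{tr}\, G_n(X_n) = \mathbb{E} \frac{1}{n} \mathrm{tr}\, G_n(Y_n) + o_\alpha(1) \]

as desired.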



It suffices to verify (48) for the off-diagonal entries ($a \neq b$). Indeed, all diagonal entries are
assumed to be zero by our previous application of Lemma 7.11.



Using (44), (45), and the independence assumption from condition CO, we obtain 
E[f(x 1 ,x 2 ,x 2 ,x i )} = E[/(0, 0,0,0)]+ Y, E— J-(0,0,0,0)E[^]+E[e] 



where 



and 



x-i 



Re(£i 



n 



X2 



Im(6) 



?? 



^3 



Re(6 



11 



X4 



Im(6 



n 



Eld = O n 



n 

"X5 



for some < 5 < 1/2 from Lemma 7.11 



We repeat the same procedure for (7/1,772) and obtain 

; d*f 



E[/(yi, y 2 , y 2 , Va)] = E[/(0, 0,0,0)]+ ^ E— ^-(0, 0, 0, 0)B[y iyj ] + O a 



71 



7? 



where 
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By (38), we have that E[xjXj] = E^y^] + o(n 1 ) uniformly for 1 < i,j < 4. Combining 
this with (45) yields 

E[/(yi,2/2,y2,2/4)] = E[/(xi,X2,X2,X4)] + o a {rT 2 ) 



and the proof of Lemma 7.14 is complete.

7.15. Proof of Lemma 7.3 and Theorem 1.5. This sub-section is devoted to Lemma
7.3 and Theorem 1.5. The proof of Lemma 7.3 relies on Lemmas 7.14, 7.5, and 7.1.



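Proof of Lemma 7.3. Assume $\mu$, $\rho$, $\{X_n\}_{n \geq 1}$, $\{Y_n\}_{n \geq 1}$, $\{F_n\}_{n \geq 1}$ satisfy the assumptions in
the statement of Lemma 7.3. By [3, Theorem A.44], it follows that for a.a. $z \in \mathbb{C}$ a.s.

\[ \nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI} - \nu_{\frac{1}{\sqrt{n}} X_n - zI} \longrightarrow 0 \]

as $n \to \infty$. Since both $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ are from the $(\mu, \rho)$-family, $(\xi_1, \xi_2)$ and
$(\eta_1, \eta_2)$ match to order 2. Thus by Lemma 7.14, for a.a. $z \in \mathbb{C}$ a.s.

\[ \nu_{\frac{1}{\sqrt{n}} X_n - zI} - \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow 0 \]

as $n \to \infty$. Therefore, for a.a. $z \in \mathbb{C}$ a.s.

\[ \nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI} - \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow 0 \]

as $n \to \infty$.

Furthermore, by Lemma 7.5, for a.a. $z \in \mathbb{C}$ a.s. $\log$ is uniformly integrable for $\{\nu_{\frac{1}{\sqrt{n}}(X_n + F_n) - zI}\}_{n \geq 1}$,
$\{\nu_{\frac{1}{\sqrt{n}} X_n - zI}\}_{n \geq 1}$, and $\{\nu_{\frac{1}{\sqrt{n}} Y_n - zI}\}_{n \geq 1}$. The result then follows by Lemma 7.1 and the uniqueness
of the logarithmic potential [4, Lemma 4.1]. □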



We can now prove Theorem 1.5.

Proof of Theorem 1.5. Let $\{X_n\}_{n \geq 1}$ be a sequence of real random matrices that satisfies
condition C0 with atom variables $(\xi_1, \xi_2)$ and $\rho = \mathbb{E}[\xi_1 \xi_2]$ for some $-1 < \rho < 1$. Let
$\{Y_n\}_{n \geq 1}$ be the sequence of random matrices that satisfies condition C0 with atom variables
$(\eta_1, \eta_2)$ where $\eta_1$ and $\eta_2$ are jointly Gaussian and $\mathbb{E}[\eta_1 \eta_2] = \rho$. In [29, Theorem 5.2], it is
shown that

\[ \mathbb{E}\, \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow \nu_z \]

as $n \to \infty$ where the family $\{\nu_z\}_{z \in \mathbb{C}}$ determines the elliptic law with parameter $\rho$ by Lemma
7.1. In fact, using the variance bound in Lemma 7.9 and [3, Theorem B.9] it can be shown
that a.s.

\[ \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow \nu_z \]

as $n \to \infty$. By Lemma 7.5, for a.a. $z \in \mathbb{C}$ a.s. $\log$ is uniformly integrable for $\{\nu_{\frac{1}{\sqrt{n}} Y_n - zI}\}_{n \geq 1}$,
and hence by Lemma 7.1 we conclude that

\[ \mu_{\frac{1}{\sqrt{n}} Y_n} \longrightarrow \mu_\rho \]

a.s. as $n \to \infty$.

Since both $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$ are from the $(1, \rho)$-family, the proof of the theorem is complete
by an application of Lemma 7.3. □
7.16. Proof of Theorem 1.7. In the proof of Theorem 1.5 above, we relied on the previous
results in [29], where the entries are assumed to be real. In order to prove Theorem 1.7 we
first need to study the complex Gaussian case.

Lemma 7.17. Let $0 \leq \mu \leq 1$ and $-1 < \rho < 1$ be given. Assume $\{X_n\}_{n \geq 1}$ is a sequence
of complex matrices that satisfy condition C0 with atom variables $(\xi_1, \xi_2)$ from the $(\mu, \rho)$-family,
where $\mathrm{Re}(\xi_1)$, $\mathrm{Im}(\xi_1)$, $\mathrm{Re}(\xi_2)$, $\mathrm{Im}(\xi_2)$ are jointly Gaussian. Then for a.a. $z \in \mathbb{C}$
a.s.

\[ \nu_{\frac{1}{\sqrt{n}} X_n - zI} \longrightarrow \nu_z \]

as $n \to \infty$ where $\{\nu_z\}_{z \in \mathbb{C}}$ determines the elliptic law with parameter $\rho$ by Lemma 7.1.



Let us assume Lemma 7.17 for now and complete the proof of Theorem 1.7.

Proof of Theorem 1.7. Let $\{X_n\}_{n \geq 1}$ be a sequence of complex random matrices that satisfy
condition C0 with atom variables $(\xi_1, \xi_2)$ from the $(\mu, \rho)$-family. Let $\{Y_n\}_{n \geq 1}$ be the
sequence of complex random matrices that satisfy condition C0 with atom variables $(\eta_1, \eta_2)$
from the $(\mu, \rho)$-family, where $\mathrm{Re}(\eta_1)$, $\mathrm{Im}(\eta_1)$, $\mathrm{Re}(\eta_2)$, $\mathrm{Im}(\eta_2)$ are jointly Gaussian. By Lemma
7.17, for a.a. $z \in \mathbb{C}$ a.s.

\[ \nu_{\frac{1}{\sqrt{n}} Y_n - zI} \longrightarrow \nu_z \]

as $n \to \infty$ where the family $\{\nu_z\}_{z \in \mathbb{C}}$ determines the elliptic law with parameter $\rho$ by Lemma
7.1. Moreover, $\log$ is uniformly integrable for $\{\nu_{\frac{1}{\sqrt{n}} Y_n - zI}\}_{n \geq 1}$ by Lemma 7.5. Therefore, by
Lemma 7.1 it follows that a.s.

\[ \mu_{\frac{1}{\sqrt{n}} Y_n} \longrightarrow \mu_\rho \]

as $n \to \infty$. The proof of Theorem 1.7 is now complete by Lemma 7.3. □



All that remains is to prove Lemma 7.17. Let $\{X_n\}_{n \geq 1}$ be the sequence of random matrices
defined in Lemma 7.17 with jointly Gaussian off-diagonal entries. We follow [29] and
introduce the following notation.

For $n \times n$ matrices $A$ and $B$, we define the $2n \times 2n$ block matrices

\[ V := \frac{1}{\sqrt{n}} \begin{pmatrix} A & 0 \\ 0 & B^* \end{pmatrix}, \qquad J(z) := \begin{pmatrix} 0 & zI \\ \bar{z} I & 0 \end{pmatrix} \]

and set

\[ V(z) := VJ - J(z) \]

where $J := J(1)$. We let $R$ denote the resolvent of $V(z)$. That is,

\[ R := [V(z) - \alpha I]^{-1} \]

for $\alpha \in \mathbb{C}$.
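Using the resolvent identity, we can compute

\[ \frac{\partial R_{ab}}{\partial \mathrm{Re}(A_{cd})} = -\frac{1}{\sqrt{n}} R_{ac} R_{d+n,b}, \qquad \frac{\partial R_{ab}}{\partial \mathrm{Im}(A_{cd})} = -\frac{\sqrt{-1}}{\sqrt{n}} R_{ac} R_{d+n,b}, \]
\[ \frac{\partial R_{ab}}{\partial \mathrm{Re}(B_{cd})} = -\frac{1}{\sqrt{n}} R_{a,d+n} R_{cb}, \qquad \frac{\partial R_{ab}}{\partial \mathrm{Im}(B_{cd})} = \frac{\sqrt{-1}}{\sqrt{n}} R_{a,d+n} R_{cb} \]

for $1 \leq c, d \leq n$ and $1 \leq a, b \leq 2n$. For the remainder of the paper, we will take $A = B = X_n$.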

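We will make use of the multivariate Gaussian decoupling formula [35]. That is, if $Y =
(\xi_j)_{j=1}^{p}$ is a real random Gaussian vector such that

\[ \mathbb{E}[\xi_j] = 0, \qquad \mathbb{E}[\xi_j \xi_k] = C_{jk} \]

for $j, k = 1, 2, \ldots, p$ and if $\Phi : \mathbb{R}^p \to \mathbb{C}$ has bounded partial derivatives, then

\[ \mathbb{E}[\xi_j \Phi(Y)] = \sum_{k=1}^{p} C_{jk}\, \mathbb{E}\left[ (\nabla \Phi(Y))_k \right]. \]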

Using the partial derivatives above and the Gaussian decoupling formula, we obtain 

E[R ab x cd ] = -^E[R a4+n R cb ] - -?=E[R ad R c+rijb ) (49) 
In y/n 



and 



H i=^-E[R ac R d+nb ] + e [R ac+n R db ] 

y/n y/n 



E[R ab X cd ] = -^[RacRd+nA ~ -^=E[R a)C+n R db ] (50) 

\/n y/n 



+ J^^ E [R a ,dRc+n,b] H ^=^E[i? a w +n -R c 6] 



for 1 < c, (I < n, c ^ (i, and 1 < a, b < 2n. 

Following [29], we define the functions

\[ s_n := s_n(\alpha, z) = \frac{1}{2n} \mathbb{E}[\mathrm{tr}\, R] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[R_{ii}] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[R_{i+n,i+n}] \]

and

\[ t_n := t_n(\alpha, z) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[R_{i+n,i}], \qquad u_n := u_n(\alpha, z) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[R_{i,i+n}]. \]

We now fix $z, \alpha \in \mathbb{C}$ with $\mathrm{Im}(\alpha) > 0$. In the definitions above, we deal with the expectation
of the summands instead of the random elements. In order to justify this, we need control
of the variance, which we obtain in the following lemma.
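Lemma 7.18.

\[ \mathrm{Var}\left( \frac{1}{n} \mathrm{tr}\, R \right) = O_{\alpha,z}\left( \frac{1}{n} \right), \tag{51} \]
\[ \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} R_{i+n,i} \right) = O_{\alpha,z}\left( \frac{1}{n} \right), \tag{52} \]
\[ \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} R_{i,i+n} \right) = O_{\alpha,z}\left( \frac{1}{n} \right). \tag{53} \]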

Proof. We begin by noting that 



1 n 
n i 



1 



i=l 



n 



ti{P 2 RPi 



and 



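where $P_1$ and $P_2$ are partial isometries. Thus, it suffices to prove

\[ \mathrm{Var}\left( \frac{1}{n} \mathrm{tr}(PRQ) \right) = O_{\alpha,z}\left( \frac{1}{n} \right) \]

for arbitrary partial isometries $P$ and $Q$.

Let $\mathbb{E}_{\leq k}$ denote conditional expectation with respect to the σ-algebra generated by the
random vectors

\[ r_1(X_n), \ldots, r_k(X_n), c_1(X_n), \ldots, c_k(X_n). \]

Define

\[ Y_k := \mathbb{E}_{\leq k} \frac{1}{n} \mathrm{tr}(PRQ) \]

for $k = 0, 1, \ldots, 2n$. Clearly $\{Y_k\}_{k=0}^{2n}$ is a martingale. Define the martingale difference
sequence

\[ \alpha_k := Y_k - Y_{k-1} \]

for $k = 1, 2, \ldots, 2n$. Then by construction

\[ \sum_{k=1}^{2n} \alpha_k = \frac{1}{n} \mathrm{tr}(PRQ) - \mathbb{E} \frac{1}{n} \mathrm{tr}(PRQ). \]

Thus we need to show that

\[ \mathbb{E}\left| \sum_{k=1}^{2n} \alpha_k \right|^2 = O_{\alpha,z}\left( \frac{1}{n} \right). \]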

Again we introduce the notation $X_{n,k}$ to denote the matrix $X_n$ with the $k$-th row and $k$-th
column replaced by zeros. Let

\[ V_k := \frac{1}{\sqrt{n}} \begin{pmatrix} X_{n,k} & 0 \\ 0 & X_{n,k}^* \end{pmatrix} \]

and define

\[ R_k := [V_k J - J(z) - \alpha I]^{-1}. \]
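It follows that

\[ \mathbb{E}_{\leq k} \frac{1}{n} \mathrm{tr}(P R_k Q) = \mathbb{E}_{\leq k-1} \frac{1}{n} \mathrm{tr}(P R_k Q) \]

and hence

\[ \alpha_k = \mathbb{E}_{\leq k}\left[ \frac{1}{n} \mathrm{tr}(PRQ) - \frac{1}{n} \mathrm{tr}(P R_k Q) \right] - \mathbb{E}_{\leq k-1}\left[ \frac{1}{n} \mathrm{tr}(PRQ) - \frac{1}{n} \mathrm{tr}(P R_k Q) \right]. \]

Because $V_k - V$ is at most rank 4, we have that

\[ |\mathrm{tr}(PRQ) - \mathrm{tr}(P R_k Q)| \leq 4 \left\| R \left( (V_k J - J(z)) - (VJ - J(z)) \right) R_k \right\|. \]

We now claim that

\[ \left\| R \left( (V_k J - J(z)) - (VJ - J(z)) \right) R_k \right\| = O_{\alpha,z}(1) \tag{54} \]

uniformly in $k$. Indeed,

\[ \| R (VJ - J(z)) R_k \| \leq \frac{1}{|\mathrm{Im}(\alpha)|} \| R (VJ - J(z)) \| \leq \frac{1}{|\mathrm{Im}(\alpha)|} \sup_{t \in \mathbb{R}} \frac{|t|}{|t - \alpha|} = O_{\alpha,z}(1) \]

since $VJ - J(z)$ is Hermitian. Similarly,

\[ \| R (V_k J - J(z)) R_k \| = O_{\alpha,z}(1) \]

and (54) follows. Thus

\[ \alpha_k = O_{\alpha,z}\left( \frac{1}{n} \right) \]

uniformly in $k$.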



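By the Burkholder inequality (see [3, Lemma 2.12] for a complex martingale version of the
Burkholder inequality), there exists an absolute constant $C > 0$ such that

\[ \mathbb{E}\left| \sum_{k=1}^{2n} \alpha_k \right|^2 \leq C\, \mathbb{E} \sum_{k=1}^{2n} |\alpha_k|^2 = O_{\alpha,z}\left( \frac{1}{n} \right). \]

□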



We are now ready to prove Lemma 7.17.

Proof of Lemma 7.17. Fix $z, \alpha \in \mathbb{C}$ with $\mathrm{Im}(\alpha) > 0$. By the resolvent identity,



We decompose, 



1 + as n = ^EtriRVJ) - |w n - ~t n . 



lmr(RVJ) = \a x + \A 2 



where 



_. n 1 n 

A 1 = -EY i (RVJ) ii and A 2 = -^Y (RV J) i+n 



i+n- 



i=l 



i=l 
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By (50) and Lemma 7.18 we have 



1 n 
1 = ^ E Yl Ri >i+ ni 



1 n 

[RiiRj+n,j+n + pRi,i+nRj,j+n — (1 — 2/i) P-RjJ Ri+n x 



■j+n 



i,j=l 



1 — 2fi)Rij +n Rij +n ] + o QjZ (l) 



= -4-^ + o a , 2 (l). 

For the second line, we used that the diagonal entries i = j give total contribution O a ^ z {n~ l l 2 ) 
which we write as the o QjZ (l) term. For the third line, we used that if 



R 



R\ R2 
R3 Ra 



where R\, R2, R3, Ra are n x n matrices, then 



1 n 

^2 Ri J R 



i+n,j+n 



n- 



MRjRi, 



< -\\R\r < 



n 



1 

n|Im(a) 



and 



1 n 

^2 RiJ+nRi: 



]+n 



1 



tr(i?Jii 2 ; 



< 



n|Im(o! 



12- 



Similarly (using (49) and Lemma 7.18), 

A 2 = -s 2 n - pt 2 n + o atZ {l). 

Thus 

1 + as n = — s n — — u n — —t n — — u n — —t n + o a ^ z {\). 



(55) 



We now obtain an equation for t n . Again by the resolvent identity 



at r . 



^ n \ n 

3/2 E ^ ~] Ri+n,j+n%ij ^72"^ ^ / R'i+n,i+n — A3 ^ri) 



where 



i=i 



1 - 



i+n,j+n%ij ■ 



We repeat almost exactly the same procedure as above (using (50) and Lemma 7.18) and 
obtain 



1 - 

A3 — 77/^ E ^ ^ [Ri+n,iRj+n,j+n ~i~ pRi+n,i+nRj,j+n 



— (1 — 2[l)pRi+ n jRiJ rn jJ rn — (1 — 2//)i?i+ n j+ n -Rij+ n ] + Oa jZ (l) 
tnSn pSnUn ~\~ Oq^I). 
Again we used the fact that the diagonal entries give contribution $O_{\alpha,z}(n^{-1/2})$ and control
the remaining error terms by writing each as a trace of products of $R_1, R_2, R_3, R_4$. Thus
we conclude

Ott n — t n S n pS n U n ZS n + Oq,z(1)- 

Similarly, we obtain an equation for u n : 

OLii n — u n s n ps n t n zs n -\- Oq, i2: (1). 



Combining the above equations for t n and u n with (55), we arrive at the following system 
of three equations: 



1 + as n 
at n 
aun 



P -u 2 - P -t 2 
2 n 2 n 



n 2 -n 2 " 2 " 2 
~tnSn PS-n^n ZS n + Oa,z(l) 
~U n Sn PSntn ZS n + Oq. i2 (1). 



tn + Q) z(l) 



(56) 



We note that the system of equations above does not depend on $\mu$. In addition to the case
above, we also consider the case when $\mu = 1$. This corresponds to the real Gaussian case
studied in [29]. Repeating the same calculations as above, we obtain the following system
of equations in the real Gaussian case:



1 + as n 



P ul - P tl 



z 



-U r 



Oil,, 



n 2 n 2 n g 2 
~tnSn PSn^n ZS n + O atZ {V) 
~U n S n ps n tn ZS n + Oq, j2; (1). 



t n + 00,2(1) 



(57) 



One can also check that this system matches (5.4), (5.5), and (5.6) from [29]. In [29], it is
shown that for every $\alpha, z \in \mathbb{C}$ with $\mathrm{Im}(\alpha) > 0$,



where so = so(a, z) is given by 



•50 



lim s r 

n— >oo 



dvJx) 



x — a 



so 



x + a 



and the family $\{\nu_z\}_{z \in \mathbb{C}}$ determines the elliptic law with parameter $\rho$ by Lemma 7.1.

By Lemma 7.18 and [3, Lemma B.9], it suffices to show that for every $\alpha, z \in \mathbb{C}$ with
$\mathrm{Im}(\alpha) > 0$,

\[ \lim_{n \to \infty} s_n = s_0 \tag{58} \]

in order to complete the proof of Lemma 7.17.



Since ||i?|| < j^), it follows that |s n |, \t n \, \u n \ < j^^- Similarly, |s n |, \i n \, \u n \ < j^). 
So by Vitali's convergence theorem, it suffices to show that for any fixed z G C, (58) holds 
for any a G C with Im(a) sufficiently large. 
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Taking Im(a) > 1, we subtract the last two equations of (56) and (57) to obtain 



/ \ - - i 1 + p + Im(a)|z| 



Im(a) 2 — 1 
1 



Im(a) 2 — 1 
1 + p + Im(a)|,z| 



•Sri Sn\ ~i~ 0q. ; 2;(1). 



Im(a) 2 — 1 Im(a) 2 — 1 

Taking Im(a) sufficiently large (in terms of p and |z|) we can write (say) 

1 * 1 

\tn t n \ ^ Yoo ^ Too ^ < ~ >a,z ^~' 

1 I " I 1 



and hence 



yy 

2 I 



We now subtract the first equations of (56) and (57) and apply the bounds above to obtain 



8 

yy 

for Im(a) > max{99, \z\}. 

This implies that for any a, z S C fixed with Im(a) sufficiently large, 

s n = s n + o(l) 



and the proof of Lemma 7.17 is complete. 



□ 



Appendix A. Proof of Theorem 3.2

Our first step is to obtain the following.

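Claim A.1 (upper bound for small ball probability). We have

\[ \sup_{a} \mathbb{P}\left( \left| \sum_{i=1}^{n} (a_i x_i + b_i x'_i) - a \right| \leq r \right) \leq \exp(\pi r^2) \int_{\mathbb{C}} \exp\left( - \sum_{i=1}^{n} \mathbb{E}_{\xi_1, \xi_2, \xi'_1, \xi'_2} \left\| \mathrm{Re}\left( 2(\xi_1 - \xi'_1) a_i t + 2(\xi_2 - \xi'_2) b_i t \right) \right\|_{\mathbb{R}/\mathbb{Z}}^2 - \pi |t|^2 \right) dt, \]

where $\|z\|_{\mathbb{R}/\mathbb{Z}}$ is the distance from a real number $z$ to its nearest integer.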



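Proof (of Claim A.1). First of all, we have

\[ \mathbb{P}\left( \sum_{i=1}^{n} (a_i x_i + b_i x'_i) \in B(a, r) \right) = \mathbb{P}\left( \left| \sum_{i=1}^{n} (a_i x_i + b_i x'_i) - a \right|^2 \leq r^2 \right) \]
\[ = \mathbb{P}\left( \exp\left( -\pi \left| \sum_{i=1}^{n} (a_i x_i + b_i x'_i) - a \right|^2 \right) \geq \exp(-\pi r^2) \right) \leq \exp(\pi r^2)\, \mathbb{E} \exp\left( -\pi \left| \sum_{i=1}^{n} (a_i x_i + b_i x'_i) - a \right|^2 \right). \]

Note that for any $z \in \mathbb{R}^2$, $\exp(-\pi |z|^2) = \int_{\mathbb{C}} e(zt) \exp(-\pi |t|^2)\, dt$. Thus,

\[ \mathbb{P}\left( \sum_{i=1}^{n} (a_i x_i + b_i x'_i) \in B(a, r) \right) \leq \exp(\pi r^2) \int_{\mathbb{C}} \mathbb{E}\, e\left( \Big( \sum_{i=1}^{n} (a_i x_i + b_i x'_i) \Big) t \right) e(-at) \exp(-\pi |t|^2)\, dt. \]

Next, because of independence we have $\left| \mathbb{E}\, e\left( \big( \sum_{i=1}^{n} (a_i x_i + b_i x'_i) \big) t \right) \right| = \prod_{i=1}^{n} \left| \mathbb{E}\, e(x_i a_i t + x'_i b_i t) \right|$,
and

\[ |\mathbb{E}\, e(x_i a_i t + x'_i b_i t)| \leq |\mathbb{E}\, e(x_i a_i t + x'_i b_i t)|^2 / 2 + 1/2 \]
\[ = \mathbb{E}_{\xi_1, \xi_2, \xi'_1, \xi'_2}\, e\left( (\xi_1 - \xi'_1) a_i t + (\xi_2 - \xi'_2) b_i t \right) / 2 + 1/2 \]
\[ = \mathbb{E}_{\xi_1, \xi_2, \xi'_1, \xi'_2} \cos\left( 2\pi\, \mathrm{Re}\left( (\xi_1 - \xi'_1) a_i t + (\xi_2 - \xi'_2) b_i t \right) \right) / 2 + 1/2 \]
\[ \leq \exp\left( - \mathbb{E}_{\xi_1, \xi_2, \xi'_1, \xi'_2} \left\| \mathrm{Re}\left( 2(\xi_1 - \xi'_1) a_i t + 2(\xi_2 - \xi'_2) b_i t \right) \right\|_{\mathbb{R}/\mathbb{Z}}^2 \right), \]

where the random vector $(\xi'_1, \xi'_2)$ is an identical independent copy of $(\xi_1, \xi_2)$, and in the last
inequality we estimated crudely $|\cos \pi z| \leq 1 - \sin^2(\pi z)/2 \leq 1 - 2\|z\|_{\mathbb{R}/\mathbb{Z}}^2 \leq \exp(-\|z\|_{\mathbb{R}/\mathbb{Z}}^2)$.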

□ 
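The crude chain of inequalities $|\cos \pi z| \leq 1 - \sin^2(\pi z)/2 \leq 1 - 2\|z\|_{\mathbb{R}/\mathbb{Z}}^2 \leq \exp(-\|z\|_{\mathbb{R}/\mathbb{Z}}^2)$ used at the end of the proof can be sanity-checked numerically. The sketch below evaluates all four quantities on a grid of real $z$; the grid and the small floating-point tolerance are choices made here, and the check is illustrative rather than a proof.

```python
import math

def dist_to_nearest_integer(z):
    """The quantity ||z||_{R/Z}: distance from z to the nearest integer."""
    return abs(z - round(z))

def chain(z):
    """The four terms of the inequality chain, in increasing order."""
    d = dist_to_nearest_integer(z)
    return (abs(math.cos(math.pi * z)),
            1 - math.sin(math.pi * z) ** 2 / 2,
            1 - 2 * d * d,
            math.exp(-d * d))

tol = 1e-12                                   # allowance for rounding error
grid = [k / 1000 for k in range(-3000, 3001)] # z ranging over [-3, 3]
chain_holds = all(a <= b + tol and b <= c + tol and c <= e + tol
                  for a, b, c, e in map(chain, grid))
```

The middle inequality is tight at half-integers, where $\sin^2(\pi z)/2 = 2\|z\|_{\mathbb{R}/\mathbb{Z}}^2 = 1/2$, which is why a tolerance is used in the comparison.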

Observe that, as $(\xi_1, \xi_2)$ belongs to a given $(\mu, \rho)$-family, so does the pair $(\omega_1, \omega_2) := ((\xi_1 -
\xi'_1)/2, (\xi_2 - \xi'_2)/2)$. Intuitively, for $\mathbb{E}|\omega_1|^2 = \mathbb{E}|\omega_2|^2 = 1$ and $|\rho| = |\mathbb{E}[\omega_1 \omega_2]| < 1$, these two
random variables are essentially not multiples of each other. We summarize this useful fact
as a claim below.

Claim A.2. Assume that $(\omega_1, \omega_2)$ belongs to a given $(\mu, \rho)$-family. Then there exist positive
numbers $\alpha, \delta, c_0, C_0$ and two Lebesgue-measurable sets $R_1$ and $R_2$ in the set $\{(x, y) \in
\mathbb{C}^2 : c_0 \leq |x|, |y| \leq C_0\}$ such that $\mathbb{P}((\omega_1, \omega_2) \in R_1), \mathbb{P}((\omega_1, \omega_2) \in R_2) \geq \delta$ and $|a/b - c/d| \geq \alpha$
for any $(a, b) \in R_1$ and $(c, d) \in R_2$.



Proof (of Claim A.2). Let $\varepsilon_0$ be a sufficiently small positive constant to be chosen. There
exist positive numbers $c_0, C_0$ depending on $\omega_1, \omega_2$ and on $\varepsilon_0$ such that the truncated random
variables $\psi_1 := \omega_1 \mathbf{1}_{c_0 \leq |\omega_1| \leq C_0}$, $\psi_2 := \omega_2 \mathbf{1}_{c_0 \leq |\omega_2| \leq C_0}$ satisfy the following:

(1) $1 - \varepsilon_0 \leq \mathbb{E}|\psi_1|^2, \mathbb{E}|\psi_2|^2 \leq 1 + \varepsilon_0$,

(2) $|\rho| - \varepsilon_0 \leq |\mathbb{E}[\psi_1 \psi_2]| \leq |\rho| + \varepsilon_0$.

Observe that it suffices to justify the claim for the truncated pair $(\psi_1, \psi_2)$. Set $k$ to be
a sufficiently large integer. We divide the square $Q := \{z \in \mathbb{C} : |\mathrm{Im}(z)|, |\mathrm{Re}(z)| \leq C_0/c_0\}$
into $k^2$ closed smaller squares $Q_1, \ldots, Q_{k^2}$ of size $2C_0/(k c_0)$ each, and then divide the region
$R := \{(x, y) \in \mathbb{C}^2 : c_0 \leq |x|, |y| \leq C_0\}$ into $k^2$ closed regions $R_i$, $i = 1, \ldots, k^2$, depending on
whether $x/y$ belongs to $Q_i$ or not. Note that if $(x, y) \in R$ then the complex number $x/y$
has absolute value bounded from above and below by $C_0/c_0$ and $c_0/C_0$ respectively, and so
$x/y \in Q$.

We next claim that for sufficiently small $\delta > 0$ (chosen depending on $c_0, C_0, k$), there are
squares $Q_{i_0}, Q_{j_0}$ that are not adjacent (i.e. not sharing a common edge) and that $\mathbb{P}(\psi_1/\psi_2 \in
Q_{i_0}) \geq \delta$ and $\mathbb{P}(\psi_1/\psi_2 \in Q_{j_0}) \geq \delta$. Indeed, assuming otherwise, then $\mathbb{P}(\psi_1/\psi_2 \in Q_i) \leq \delta$
holds for all but at most 9 adjacent squares. The larger square $Q'$ formed by these adjacent
ones has size at most $6C_0/(k c_0)$ and satisfies

\[ \mathbb{P}(\psi_1/\psi_2 \in Q') \geq 1 - (k^2 - 9)\delta. \]

We now concentrate on the event $\psi_1/\psi_2 \in Q'$. Because of the definition, there exists a
number $c$ such that if $x/y \in Q'$ then the difference $|x/y - c|$ can be bounded crudely by
$6C_0/(k c_0)$. Without loss of generality, we assume that $|c| \geq 1$. (Otherwise we consider the
ratio $\psi_2/\psi_1$ instead.) Clearly,

\[ |\mathbb{E}[\psi_1 \psi_2]| \geq \left| \mathbb{E}[\psi_1 \psi_2 \mathbf{1}_{\psi_1/\psi_2 \in Q'}] \right| - \left| \mathbb{E}[\psi_1 \psi_2 (1 - \mathbf{1}_{\psi_1/\psi_2 \in Q'})] \right|. \]

The expectation of the second term can be bounded crudely from above by $C_0^2 (k^2 - 9)\delta$, while
the expectation of the first term can be bounded from below by $(|c| - 6C_0/(k c_0))\, \mathbb{E}|\psi_1|^2 -
C_0^2 (k^2 - 9)\delta$, which is at least $(1 - 6C_0/(k c_0))(1 - \varepsilon_0) - C_0^2 (k^2 - 9)\delta$ because $|c| \geq 1$ and
$\mathbb{E}|\psi_1|^2 \geq 1 - \varepsilon_0$ from item (1) above. Finally, by choosing $k$ to be large enough (depending
on $\varepsilon_0, c_0, C_0$) and then $\delta$ to be small enough (depending on $\varepsilon_0, C_0$ and $k$), we obtain a lower
bound $1 - 2\varepsilon_0$ for $|\mathbb{E}[\psi_1 \psi_2]|$. This is impossible as from item (2) we have $|\mathbb{E}[\psi_1 \psi_2]| \leq
|\rho| + \varepsilon_0 < 1 - 2\varepsilon_0$.

In summary, we have obtained two closed sub-regions $R_{i_0}, R_{j_0}$ of $R$ such that the
corresponding squares $Q_{i_0}$ and $Q_{j_0}$ are not adjacent and that both $\mathbb{P}(\psi_1/\psi_2 \in Q_{i_0})$ and
$\mathbb{P}(\psi_1/\psi_2 \in Q_{j_0})$ are at least $\delta$. By definition, as $Q_{i_0}$ and $Q_{j_0}$ are not adjacent, we have
$|a/b - c/d| \geq 2C_0/(k c_0)$ as long as $(a, b) \in R_{i_0}$ and $(c, d) \in R_{j_0}$, completing the proof. □



We now apply Claims A.1 and A.2 to prove Theorem 3.2. Our method here follows [33] with
non-trivial modifications.

Proof (of Theorem 3.2). For short, set $a'_i := \beta^{-1} a_i$, $b'_i := \beta^{-1} b_i$. Also, we will denote by
$z$ and $z'$ the random variables $2(\xi_1 - \xi'_1)$ and $2(\xi_2 - \xi'_2)$ respectively, where $(\xi'_1, \xi'_2)$ is an
identical independent copy of $(\xi_1, \xi_2)$. By definition, we have

\[ \gamma = \sup_{a} \mathbb{P}\left( \left| \sum_{i=1}^{n} (a_i x_i + b_i x'_i) - a \right| \leq \beta \right) = \sup_{a} \mathbb{P}\left( \left| \sum_{i=1}^{n} (a'_i x_i + b'_i x'_i) - a \right| \leq 1 \right). \]



Set M := 2^4 log n where A is large enough. Prom Claim A.l and the fact that 7 > n 
we easily obtain 



1< 
2 ~ 



,. n 

/ exp(- £ E €li&i€iift ||Re(2(6 - + 2(6 - £)6jt)) || R/Z - vr|t| 2 )dt 

y|ti<Af i=1 v y 

/ exp(- VE^IlRefzo^ + ^HLz-Trltl 2 )^. (59) 



'\t\<M i=l 

Large level sets. For each integer < m < M we define the level set 
S m :=lteC:J2 E*,*' ||Re(*# + 2^) || R/Z + |t| 2 < 



m > . 



Then it follows from (59) that X^m<M f^i^m) ex P( — y + 1) > 7> where //(.) denotes the 
Lebesgue measure of a measurable set. Hence there exists m < M such that fJ>(S m ) > 
7 exp(f -2). 



Next, since S* m C -B(0, \/m), by the pigeon-hole principle there exist an absolute constant 
c and a ball B(xq, \) C -6(0, -^/m) such that 



n{B{x Q , -) n S m ) > cn(S m )m 1 > C7exp(^ - 2)m 



Consider ii,t 2 £ B(xq, 1/2) DS m . By Cauchy-Schwarz inequality (note that E* j;8 /||Re(;Zfl^£+ 
/2 



z'fe^i) || R / Z is a norm in £) we have 



^E^||Re(^(ii - t 2 ) + Js'^Cti - £ 2 )) Hr/z 

i=l 

/ n n \ 

< 2 ^(E z>z ,||Re(za , i t 1 + z'b'^ || R/Z + ^ E^, ||Re (zajt 2 + ^t 2 ) || R/Z < 4m. 



vi=l i=l 



Since £1 — £ 2 £ B(0, 1) and /j,(B(xq, f,) n 5 m — B(xq, ^) H 5 m ) > /j,(B(xq, §) n 5 m ), if we put 

n 

T:={£G 5(0,1), ^E^,||Re(^£ + ^t)|| R/z < 4m}, 



i=l 






then 



fi(T) > C7exp( — — 2)m . 

Discretization. Choose $N$ to be a sufficiently large prime (depending on the set $T$). Define the discrete box

\[
B_0 := \big\{ k_1/N + \sqrt{-1}\, k_2/N : k_1, k_2 \in \mathbb{Z},\ -N \le k_1, k_2 \le N \big\}.
\]

We consider all the shifted boxes $z + B_0$, where $(\mathrm{Re}\, z, \mathrm{Im}\, z) \in [0, 1/N]^2$. By the pigeon-hole principle, there exists $z_0$ such that the size of the discrete set $(z_0 + B_0) \cap T$ is at least the expectation, $|(z_0 + B_0) \cap T| \ge N^2 \mu(T)$ (to see this, we first consider the case when $T$ is a box itself).
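The box case mentioned in parentheses can be checked numerically. The following is a minimal Python sketch, assuming $T$ is an axis-parallel rectangle and sampling only a finite grid of shifts; the function names, the grid resolution and the sample rectangle are illustrative assumptions and do not come from the proof. Exact rational arithmetic is used so that boundary points are counted reliably.

```python
from fractions import Fraction


def count_in_interval(N, shift, lo, hi):
    """Count integers k in [-N, N] with lo <= shift + k/N <= hi (one coordinate)."""
    return sum(1 for k in range(-N, N + 1) if lo <= shift + Fraction(k, N) <= hi)


def best_shift_count(N, box, resolution=8):
    """Largest |(z + B_0) ∩ T| over a finite grid of shifts z in [0, 1/N)^2.

    T is the rectangle box = ((x_lo, x_hi), (y_lo, y_hi)). For a rectangle the
    count factorises over the two coordinates, so each axis is maximised separately.
    """
    (x_lo, x_hi), (y_lo, y_hi) = box
    shifts = [Fraction(j, N * resolution) for j in range(resolution)]
    best_x = max(count_in_interval(N, z, x_lo, x_hi) for z in shifts)
    best_y = max(count_in_interval(N, z, y_lo, y_hi) for z in shifts)
    return best_x * best_y


# Hypothetical example: N = 7 and T = [-1/3, 2/5] x [0, 1/2], so mu(T) = 11/30.
box = ((Fraction(-1, 3), Fraction(2, 5)), (Fraction(0), Fraction(1, 2)))
area = Fraction(11, 15) * Fraction(1, 2)
print(best_shift_count(7, box), ">=", float(49 * area))
```

Averaging the count over all shifts gives exactly $N^2\mu(T)$ for such a rectangle, so some shift must do at least as well; the sketch only searches a grid, which is enough for this example.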

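Let us fix some $t_0 \in (z_0 + B_0) \cap T$. Then for any $t \in (z_0 + B_0) \cap T$ we have

\[
\begin{aligned}
&\sum_{i=1}^n \mathbf{E}_{z,z'} \big\|\mathrm{Re}\big(z a_i'(t_0 - t) + z' b_i'(t_0 - t)\big)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \\
&\quad \le 2\Big(\sum_{i=1}^n \mathbf{E}_{z,z'} \big\|\mathrm{Re}(z a_i' t + z' b_i' t)\big\|_{\mathbb{R}/\mathbb{Z}}^2 + \sum_{i=1}^n \mathbf{E}_{z,z'} \big\|\mathrm{Re}(z a_i' t_0 + z' b_i' t_0)\big\|_{\mathbb{R}/\mathbb{Z}}^2\Big) \le 16m.
\end{aligned}
\]

Notice that $t_0 - t \in B_1 := B_0 - B_0 = \{ k_1/N + \sqrt{-1}\, k_2/N : k_1, k_2 \in \mathbb{Z},\ -2N \le k_1, k_2 \le 2N \}$. Thus there exists a subset $S$ of size at least $c N^2 \gamma \exp(\frac{m}{4} - 2) m^{-1}$ of $B_1$ such that the following holds for any $s \in S$:

\[
\sum_{i=1}^n \mathbf{E}_{z,z'} \big\|\mathrm{Re}(z a_i' s + z' b_i' s)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16m.
\]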

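Double counting and separation. By definition of $S$, we have

\[
\sum_{s \in S} \sum_{i=1}^n \mathbf{E}_{z,z'} \big\|\mathrm{Re}(z a_i' s + z' b_i' s)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16 m |S|.
\]

Notice that, for $z = 2(\xi_1 - \xi_1')$ and $z' = 2(\xi_2 - \xi_2')$, $(z/4, z'/4)$ belongs to the $(\mu, \rho)$-family. By Claim A.2, there exist $(c_1, c_2) \in \mathcal{R}_1$ and $(c_1', c_2') \in \mathcal{R}_2$ such that

\[
\sum_{s \in S} \sum_{i=1}^n \big\|\mathrm{Re}\big((4 c_1 a_i' + 4 c_2 b_i') s\big)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16 \delta^{-1} m |S|
\]

and

\[
\sum_{s \in S} \sum_{i=1}^n \big\|\mathrm{Re}\big((4 c_1' a_i' + 4 c_2' b_i') s\big)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 16 \delta^{-1} m |S|.
\]

From now on, for brevity, we denote by $v_i$ the complex number $4 c_1 a_i' + 4 c_2 b_i'$ for $1 \le i \le n$, and by $v_{n+i}$ the complex number $4 c_1' a_i' + 4 c_2' b_i'$ for $1 \le i \le n$. We then have

\[
\sum_{s \in S} \sum_{i=1}^{2n} \big\|\mathrm{Re}(v_i s)\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 32 \delta^{-1} m |S|.
\]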

ses i=i 

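Switching to $\mathbb{R}^2$. Next, for convenience, we view each $v_i$ as the vector $(\mathrm{Re}\, v_i, \mathrm{Im}\, v_i)$ and each $s \in S$ as the vector $(\mathrm{Re}\, s, -\mathrm{Im}\, s)$ of $\mathbb{R}^2$. So we can write $\mathrm{Re}(v_i s)$ as $\langle v_i, s \rangle$, and thus obtain the new estimate in $\mathbb{R}^2$,

\[
\sum_{s \in S} \sum_{i=1}^{2n} \big\|\langle v_i, s \rangle\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le 32 \delta^{-1} m |S|.
\]

Let $n'$ be any number between $n^{\epsilon}$ and $2n$. We say that an index $1 \le i \le 2n$ is bad if

\[
\sum_{s \in S} \big\|\langle v_i, s \rangle\big\|_{\mathbb{R}/\mathbb{Z}}^2 \ge \frac{32 \delta^{-1} m |S|}{n'}.
\]

Then the number of bad indices is at most $n'$. Let $V$ be the set of remaining $v_i$'s. Thus $V$ contains at least $2n - n'$ elements. In the rest of the proof, we are going to show that the set $V$ is close to a GAP.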
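The bound on the number of bad indices is a plain averaging argument: nonnegative per-index totals whose sum is at most a budget cannot reach the threshold budget$/n'$ more than $n'$ times. A minimal Python sketch, with hypothetical names (`totals` stands for the sums over $s \in S$ of the squared distances, `budget` for the right-hand side of the estimate in $\mathbb{R}^2$):

```python
def bad_indices(totals, budget, n_prime):
    """Return the indices i whose total is at least budget / n_prime.

    Assumes every entry of totals is nonnegative and sum(totals) <= budget,
    in which case at most n_prime indices can be returned.
    """
    threshold = budget / n_prime
    return [i for i, t in enumerate(totals) if t >= threshold]


# Hypothetical data: five indices, total mass 10, n' = 3.
totals = [5, 1, 0, 3, 1]
print(bad_indices(totals, budget=10, n_prime=3))
```

If more than $n'$ indices were bad, their totals alone would exceed the budget, which is the contradiction used in the text.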

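Dual sets. Consider an arbitrary good index $i$, we have

\[
\sum_{s \in S} \big\|\langle s, v_i \rangle\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le \frac{32 \delta^{-1} m |S|}{n'}.
\]

Set $k := \sqrt{\frac{n'}{128 \pi^2 \delta^{-1} m}}$ and let $V_k := k(V \cup \{0\})$. By Cauchy-Schwarz inequality, for any $v \in V_k$ we have

\[
\sum_{s \in S} 2 \pi^2 \big\|\langle s, v \rangle\big\|_{\mathbb{R}/\mathbb{Z}}^2 \le \frac{|S|}{2},
\]

which implies

\[
\sum_{s \in S} \cos(2 \pi \langle s, v \rangle) \ge \frac{|S|}{2}.
\]

Observe that for any $x \in C(0, \frac{1}{512})$ (the ball of radius $1/512$ in the $\|\cdot\|_\infty$ norm) and any $s \in S \subset C(0, 2)$ we always have $\cos(2\pi \langle s, x \rangle) \ge 1/2$ and $\sin(2\pi \langle s, x \rangle) \le 1/12$. Thus for any $x \in C(0, \frac{1}{512})$,

\[
\sum_{s \in S} \cos(2 \pi \langle s, (v + x) \rangle) \ge \frac{|S|}{4} - \frac{|S|}{12} = \frac{|S|}{6}.
\]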
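The two elementary facts used in this step can be verified numerically. The sketch below is an illustration only, with hypothetical function names: it computes the largest phase $2\pi\langle s, x\rangle$ over the two sup-norm balls (a bilinear form attains its maximum at corners), and checks the inequality $\cos(2\pi y) \ge 1 - 2\pi^2 \|y\|_{\mathbb{R}/\mathbb{Z}}^2$ on a grid, which is what turns the quadratic estimate into a lower bound for the cosine sum.

```python
import math
from itertools import product


def worst_case_phase(radius_x=1 / 512, radius_s=2.0):
    """Largest |2*pi*<s, x>| over ||x||_inf <= radius_x and ||s||_inf <= radius_s in R^2."""
    corners_x = list(product((-radius_x, radius_x), repeat=2))
    corners_s = list(product((-radius_s, radius_s), repeat=2))
    return max(
        abs(2 * math.pi * (s[0] * x[0] + s[1] * x[1]))
        for x in corners_x
        for s in corners_s
    )


def dist_to_integers(y):
    """Distance from the real number y to the nearest integer."""
    return abs(y - round(y))


def cosine_lower_bound_holds(y, tol=1e-12):
    """Check cos(2*pi*y) >= 1 - 2*pi^2 * dist(y, Z)^2 up to a rounding tolerance."""
    return math.cos(2 * math.pi * y) >= 1 - 2 * math.pi ** 2 * dist_to_integers(y) ** 2 - tol


theta = worst_case_phase()
print(theta, math.cos(theta), math.sin(theta))
```

The worst phase is $\pi/64$, comfortably inside the range where the cosine exceeds $1/2$ and the sine stays below $1/12$.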

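On the other hand,

\[
\int_{x \in [0,N]^2} \Big(\sum_{s \in S} \cos(2\pi \langle s, x \rangle)\Big)^2 dx \le \sum_{s_1, s_2 \in S} \int_{x \in [0,N]^2} \exp\big(2\pi i \langle s_1 - s_2, x \rangle\big)\, dx \ll |S| N^2.
\]

Hence we deduce the following

\[
\mu_{x \in [0,N]^2}\Big(\Big(\sum_{s \in S} \cos(2\pi \langle s, x \rangle)\Big)^2 \ge \Big(\frac{|S|}{6}\Big)^2\Big) \ll \frac{|S| N^2}{(|S|/6)^2} \ll \frac{N^2}{|S|}.
\]

Now use the fact that $S$ has large size, $|S| \gg N^2 \gamma \exp(\frac{m}{4} - 2) m^{-1}$, and $N$ was chosen to be large enough so that $V_k + C(0, \frac{1}{512}) \subset [0, N]^2$, we have

\[
\mu\big(V_k + C(0, \tfrac{1}{512})\big) \ll \gamma^{-1} \exp\big(-\tfrac{m}{4} + 2\big) m.
\]

Thus, we have obtained the following

\[
\mu\Big(k(V \cup \{0\}) + C(0, \tfrac{1}{512})\Big) \ll \gamma^{-1} \exp\big(-\tfrac{m}{4} + 2\big) m. \tag{60}
\]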
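The bound $\ll |S| N^2$ for the double sum rests on orthogonality: since $S$ lies in the lattice $\frac{1}{N}\mathbb{Z}^2$, the integral of $\exp(2\pi i \langle s_1 - s_2, x\rangle)$ over $[0,N]^2$ vanishes unless $s_1 = s_2$. The following Python sketch illustrates this under stated assumptions: points of $S$ are encoded as integer pairs $(k_1, k_2)$ standing for $(k_1/N, k_2/N)$, the names are hypothetical, and the integral is replaced by a Riemann sum, which reproduces the exact value up to rounding as long as the frequency differences are smaller than the number of sample points.

```python
import cmath
import math


def integral_1d(k, N, samples=64):
    """Riemann sum for the integral of exp(2*pi*i*(k/N)*x) over [0, N].

    Up to rounding this equals N for k = 0 and 0 for 0 < |k| < samples.
    """
    step = N / samples
    return sum(
        cmath.exp(2j * math.pi * (k / N) * (j * step)) for j in range(samples)
    ) * step


def double_sum(S, N):
    """Sum over s1, s2 in S of the integral of exp(2*pi*i*<s1 - s2, x>) over [0, N]^2."""
    total = 0
    for a1, a2 in S:
        for b1, b2 in S:
            # The two-dimensional integral factorises over the coordinates.
            total += integral_1d(a1 - b1, N) * integral_1d(a2 - b2, N)
    return total


# Hypothetical example with N = 5 and four lattice points.
S = [(0, 0), (1, -3), (5, 10), (-10, 2)]
print(abs(double_sum(S, 5)), "vs |S| * N^2 =", len(S) * 5 ** 2)
```

Only the diagonal pairs contribute, each with weight $N^2$, which gives the total $|S| N^2$.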



The long range inverse theorem. Our next analysis relies on the following result.

Theorem A.3. [33, Theorem 3.2] Let $\alpha > 0$ be constant. Assume that $X$ is a subset of a torsion-free group such that $0 \in X$ and $|kX| \le k^{\alpha} |X|$ for some integer $k \ge 2$ that may depend on $|X|$. Then, there is a proper symmetric GAP $P$ of rank $r = O(\alpha)$ and cardinality $O_{\alpha}(k^{-r} |kX|)$ such that $X \subset P$.
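For readers less familiar with the terminology, a symmetric GAP in $\mathbb{Z}^2$ is a set $\{\sum_i x_i g_i : |x_i| \le N_i\}$, and it is proper when all these sums are distinct. The sketch below enumerates such a set and tests properness directly; it is a brute-force illustration with hypothetical function names, not a construction used in the proof.

```python
from itertools import product


def symmetric_gap(generators, sizes):
    """List all sums x_1 g_1 + ... + x_r g_r with |x_i| <= N_i, for generators in Z^2."""
    points = []
    for coeffs in product(*(range(-n, n + 1) for n in sizes)):
        x = sum(c * g[0] for c, g in zip(coeffs, generators))
        y = sum(c * g[1] for c, g in zip(coeffs, generators))
        points.append((x, y))
    return points


def is_proper(generators, sizes):
    """A GAP is proper when distinct coefficient vectors give distinct points."""
    points = symmetric_gap(generators, sizes)
    return len(set(points)) == len(points)


print(is_proper([(1, 0), (0, 1)], [2, 3]))          # rank 2, proper
print(is_proper([(1, 0), (2, 0)], [2, 2]))          # collisions, not proper
print(is_proper([(1, 0), (0, 1), (7, 0)], [3, 3, 1]))  # rank 3 inside Z^2, proper
```

The last example shows that the rank of a proper GAP inside $\mathbb{Z}^2$ can exceed 2, which is why the case $r \ge 3$ has to be treated separately from $r = 2$.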

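Let $D := 2048 \times 16 \times \delta^{-1} = \Theta(\delta^{-1})$. We approximate each vector $v$ of $V$ by a closest vector

\[
\Big\| v - \frac{u}{Dk} \Big\|_2 \le \frac{\sqrt{2}}{Dk}, \quad \text{with } u \in \mathbb{Z}^2.
\]

Let $U$ be the collection of all such $u$. Since $\sum_{v \in V} \|v\|_2^2 = O(\beta^{-2})$, we have

\[
\sum_{u \in U} \|u\|_2^2 = O_{\delta^{-1}}(k^2 \beta^{-2}). \tag{61}
\]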
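The approximation step is coordinate-wise rounding to the lattice $\frac{1}{Dk}\mathbb{Z}^2$. A minimal Python sketch, assuming for illustration the hypothetical values $\delta = 1/4$ and $k = 3$ (neither is taken from the paper), checks that the rounding error stays below $\sqrt{2}/(Dk)$:

```python
import math


def nearest_lattice_vector(v, D, k):
    """Return u in Z^2 such that u / (D*k) is a closest lattice point to v."""
    scale = D * k
    return (round(v[0] * scale), round(v[1] * scale))


def approximation_error(v, D, k):
    """Euclidean distance between v and its rounded lattice point u / (D*k)."""
    scale = D * k
    u = nearest_lattice_vector(v, D, k)
    return math.hypot(v[0] - u[0] / scale, v[1] - u[1] / scale)


# Hypothetical parameters: delta = 1/4 gives D = 2048 * 16 * 4.
D, k = 2048 * 16 * 4, 3
v = (0.123456, -0.987654)
print(approximation_error(v, D, k), "<=", math.sqrt(2) / (D * k))
```

Each coordinate moves by at most $1/(2Dk)$, so the Euclidean error is in fact at most $\sqrt{2}/(2Dk)$, which is within the bound stated in the text.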



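It follows from (60) that

\[
|k(U + C_0(0,1))| = O\Big((Dk)^2 \gamma^{-1} \exp\big(-\tfrac{m}{4} + 2\big) m\Big) = O\Big(\gamma^{-1} k^2 \exp\big(-\tfrac{m}{4} + 2\big) m\Big),
\]

where $C_0(0, r)$ is the discrete cube $\{(x_1, x_2) \in \mathbb{Z}^2 : |x_i| \le r\}$.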



Now we apply Theorem A.3 to the set $U + C_0(0,1)$ (notice that $0 \in U$). That theorem implies there exists a proper GAP $P = \{\sum_{i=1}^r x_i g_i : |x_i| \le N_i\} \subset \mathbb{Z}^2$ containing $U + C_0(0,1)$ which has small rank $r = O(1)$, and small size

\[
|P| = O\Big(\gamma^{-1} k^2 \exp\big(-\tfrac{m}{4} + 2\big) m\, k^{-r}\Big) = O\big(\gamma^{-1} n'^{(-r+2)/2}\big).
\]



Moreover, we have learned from [33, Lemma 4.4] that $kP$ can be contained in a set $ck(U + C_0(0,1))$ for some $c = O(1)$. Using (61), we conclude that all the generators $g_i$ of $P$ are bounded,

\[
\|g_i\|_2 = O(k \beta^{-1}).
\]



Next, since $C_0(0,1) \subset P$, the rank $r$ of $P$ is at least 2. We consider the following two cases.

Case 1: $r \ge 3$. Recall that $|P| = O(\gamma^{-1} n'^{(2-r)/2}) = O(\gamma^{-1}/\sqrt{n'})$. Let
It is clear that $Q$ satisfies all of the conditions of Theorem 3.2. (Note that, in this case, we
obtain a stronger approximation; almost all elements of V are 0(^=^)-close to Q.) 

Case 2: $r = 2$. Because the unit vectors $e_1 = (1, 0)$, $e_2 = (0, 1)$ belong to $P = \{\sum_{i=1}^2 x_i g_i : |x_i| \le N_i\} \subset \mathbb{Z}^2$, the set of generators $g_1, g_2$ forms a base with the unit determinant of $\mathbb{R}^2$. In $P$, consider the set of lattice points with all coordinates divisible by $k$. We observe that (for instance, by [46, Theorem 3.36]) this set can be contained in a GAP $P'$ of rank 2 and cardinality at most $\max\big(O(k^{-2}|P|), 1\big) = \max\big(O(\gamma^{-1}/n'), 1\big)$. (Here, we use the bound $|P| = O(\gamma^{-1} \exp(-\frac{m}{4}) m)$.) Next, define

It is easy to verify that $Q$ satisfies all of the conditions of Theorem 3.2. (Note that, in this case, we obtain a stronger bound on the size of $Q$.)



□ 



Appendix B. Proof of Lemma 4.3
Set $a_{ij}' := a_{ij}/\beta$. By definition,

\[
\gamma = \sup_{a, b_i, b_i'} \mathbf{P}_{x,x'}\Big(\Big|\sum_{i,j} a_{ij}' x_i x_j' + \sum_i b_i x_i + \sum_i b_i' x_i' - a\Big| \le 1\Big) \ge n^{-B}.
\]



By Markov's inequality we have 
P x ,x' (I ^2 a 'ij x i x 'j + ^2 biXi + ^2 h 'i x 'i ~ a \ - X ) 

i,j i i 

= P(exp(-|| a'ijXix'j + Y b i x i + b 'i x 'i ~ a ' 2 - ex P(~|)) 

i,j i i 

< exp(^)E XjX / exp ( - ^| ^ aysc*:^ + ^ hxi + ^ b\d { - a| 2 ) 

i,j i i 

<exp(|) J \B^e[(Ya[ j x i x' j + Yb i x i + Y / b Wi)-A\eM-^\t\ 2 )dt 

< exp(|)(v^) 2 J |E x , x ,e[(^ d^x) + ]T b lXl + £ &&)) ■ t}\ exp(-||t| 2 )/(V2^) 2 dt 
where in the fourth equation we used the identity exp(— §|x| 2 ) = f c e(xt) exp(— ||t| 2 )cit.

Consider $x = (x_1, \dots, x_n)$ as $(x_U, x_{\bar U})$ and $x' = (x_1', \dots, x_n')$ as $(x_U', x_{\bar U}')$, where $x_U, x_{\bar U}$ are the vectors corresponding to $i \in U$ and $i \notin U$ respectively. After a series of applications of the identity $\int_{\mathbb{C}} \exp(-\frac{1}{2}|t|^2)/(\sqrt{2\pi})^2\, dt = 1$ and Cauchy-Schwarz inequality, we obtain



< 



< 



1 4 

J |E X)X ,e((^ a'ijxrfj + b i x i + E h 'i x 'i) • *) I exp(-^\t\ 2 )/(V2^) 2 dt 

C hi i i 

1 2 

J |E XjX ,e((^ a/^Xix'j + £ b iXi + £ &&)) • t) f exp(-||i| 2 )/(v/^r") 2 eft 

«,j i i 

1 2 

E x _ iX ,_e((J>^. + J>x, + ■ t) |' exp(-^|t| 2 )/(V2^) 2 dt 

/ E X[7!X ^E x _ )X /_ iy _ )y ^e(( ^ a^(x; -^)+ ^ - y^x'j + ^ 6^- - Vj ) 

c ieujeu ieu,jeu j&u 

- 2 

+ E W " y'i) + E 4(^4 " ViV'j)) ■ *) exp(-||t| 2 )/(v^) 2 dt 

< / E x _ iX ,_ )y _ jy _ |E XUjX / u e(( o^-XiCa/j - j/J-) + ^-(a* - y,)^) + H x j ~ Vj) 



i£U,j<=u ieujeu j&u 

2 



+ E W - ^) + E 4(^4- " W^)) • *) exp(-^|t| 2 )/(v^) 2 dt 

ieu,jeu 

+ ]T o^x* - y*)(^ - yj)) • i) exp(-^|t| 2 )/(v / 2^) 2 ^ 

i£U,j£U 



E x ( -y, .x, .y ; .x; , y ; .x; . 



= ^E VjW e(( ^ a' ij v i w j + ^ a'^Wj) ■ ij exp(-||t| 2 )/(v / 2vr) 2 ^ 



i&u,jeu 



ieujeu 



(l/\/2^) 2 E V)W exp(-|| ^ o^-Ui«;j+ ^ o^Wjl 2 ), 



(62) 



ieu,jeu 



i&U,j&U 



where $(y_U, y_U')$ and $(y_{\bar U}, y_{\bar U}')$ are independent identical copies of $(x_U, x_U')$ and $(x_{\bar U}, x_{\bar U}')$ respectively, and $v := x - y$, $w := x' - y'$.



Thus 
7 4 = (P X)X '(| ^2 a 'ij X i X 'j + ^2 biXi + ^2 b 'i X 'i ~ °l - 1 )

i,j i i 

< exp(4vr)(2vr) 4 ^ jf |E x>x ,e[(^ a^a^ + ^ 6^ + ^ 6^)) • i] | exp(-||t| 2 )/(v^) 2 <ftj 

< exp(47r)(27r) 3 E VjW exp(-^| ^ a'^Wj + ^ a^u^-| 2 ). 

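Because $\gamma \ge n^{-B}$, the inequality above implies that

\[
\mathbf{P}_{v,w}\Big(\Big|\sum_{i \in U, j \in \bar U} a_{ij}' v_i w_j + \sum_{i \in \bar U, j \in U} a_{ij}' v_i w_j\Big| = O_B\big(\sqrt{\log n}\big)\Big) \ge \frac{1}{2} \gamma^4 / \big((2\pi)^3 \exp(4\pi)\big).
\]

Scaling back to $a_{ij}$, we thus obtain

\[
\mathbf{P}_{v,w}\Big(\Big|\sum_{i \in U, j \in \bar U} a_{ij} v_i w_j + \sum_{i \in \bar U, j \in U} a_{ij} v_i w_j\Big| = O_B\big(\beta \sqrt{\log n}\big)\Big) \ge \frac{1}{2} \gamma^4 / \big((2\pi)^3 \exp(4\pi)\big),
\]

completing the proof.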